| CPC G06F 40/56 (2020.01) [G06F 40/284 (2020.01); G10L 15/08 (2013.01)] | 18 Claims |

|
1. A system comprising:
a processor; and
a computer-readable medium storing instructions that are operative upon execution by the processor to:
receive a stream of tokens, each token representing an element of human speech;
chunk the stream of tokens;
tag, by a tagger, the stream of tokens with one or more tags of a plurality of tags to produce a tagged stream of tokens by chunks in a streaming manner, each tag of the plurality of tags representing a different normalization category of a plurality of normalization categories, wherein the tagger comprises a neural network using self-attention to compute representations of input and output;
detect, by each category-specific natural language converter of a plurality of category-specific natural language converters, each of the plurality of category-specific natural language converters comprising a weighted finite state transducer (WFST), from the tagged stream of tokens, a tag representing a normalization category of the plurality of normalization categories upon which each category-specific natural language converter is trained to operate, wherein each category-specific natural language converter is trained for a single normalization category of the plurality of normalization categories by each respective trainer of a plurality of trainers;
upon detecting a first tag representing a first normalization category, convert, by a first language converter of the plurality of category-specific natural language converters, a first token of the tagged stream of tokens, from a first lexical language form to a first natural language form, wherein the first language converter is trained to operate upon the first normalization category, and wherein the first token is associated with the first tag;
upon detecting a second tag representing a second normalization category, convert, in parallel with converting by the first language converter, by a second language converter of the plurality of category-specific natural language converters, a second token of the tagged stream of tokens from a second lexical language form to a second natural language form, wherein the second language converter is trained to operate upon the second normalization category, and wherein the second token is associated with the second tag; and
based on at least the first natural language form, output a natural language representation of the stream of tokens.
|
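
The data flow recited in claim 1 (chunk the token stream, tag each chunk, route tagged tokens to category-specific converters that run in parallel, then emit the normalized text) can be illustrated with a minimal sketch. It is not the claimed implementation: the lexicon-lookup `tag_chunk` stands in for the self-attention neural tagger, the dictionary-backed entries in `CONVERTERS` stand in for trained WFSTs, and the tag names `CARDINAL` and `ORDINAL`, the chunk size, and all function names are hypothetical choices made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterable, Iterator, List, Tuple

Tagged = Tuple[str, str]  # (token, tag)

# Hypothetical toy lexicons; a real system would learn these mappings.
NUMBER_WORDS = {"one": "1", "two": "2", "three": "3", "four": "4", "five": "5"}
ORDINAL_WORDS = {"first": "1st", "second": "2nd", "third": "3rd"}


def chunk(tokens: Iterable[str], size: int) -> Iterator[List[str]]:
    """Yield fixed-size chunks from a token stream (the last may be shorter)."""
    buf: List[str] = []
    for tok in tokens:
        buf.append(tok)
        if len(buf) == size:
            yield buf
            buf = []
    if buf:
        yield buf


def tag_chunk(tokens: List[str]) -> List[Tagged]:
    """Stand-in tagger: lexicon lookup, NOT a self-attention network."""
    tagged: List[Tagged] = []
    for tok in tokens:
        if tok in NUMBER_WORDS:
            tagged.append((tok, "CARDINAL"))
        elif tok in ORDINAL_WORDS:
            tagged.append((tok, "ORDINAL"))
        else:
            tagged.append((tok, "O"))  # "O" = no normalization needed
    return tagged


# Stand-in converters: one per normalization category, NOT real WFSTs.
CONVERTERS: Dict[str, Callable[[str], str]] = {
    "CARDINAL": lambda tok: NUMBER_WORDS[tok],
    "ORDINAL": lambda tok: ORDINAL_WORDS[tok],
}


def convert_category(tag: str, tagged: List[Tagged]) -> Dict[int, str]:
    """Detect tokens carrying `tag` and convert only those, keyed by position."""
    convert = CONVERTERS[tag]
    return {i: convert(tok) for i, (tok, t) in enumerate(tagged) if t == tag}


def normalize(tokens: Iterable[str], chunk_size: int = 4) -> str:
    """Chunk, tag, convert each category concurrently, and join the output."""
    pieces: List[str] = []
    with ThreadPoolExecutor(max_workers=len(CONVERTERS)) as pool:
        for toks in chunk(tokens, chunk_size):
            tagged = tag_chunk(toks)
            # Each converter scans the same tagged chunk for its own tag.
            futures = [pool.submit(convert_category, tag, tagged)
                       for tag in CONVERTERS]
            converted: Dict[int, str] = {}
            for fut in futures:
                converted.update(fut.result())
            # Untagged tokens pass through unchanged, preserving order.
            pieces.extend(converted.get(i, tok)
                          for i, (tok, _) in enumerate(tagged))
    return " ".join(pieces)
```

For example, `normalize("meet me at gate five on the second floor".split())` returns `"meet me at gate 5 on the 2nd floor"`. Each converter touches only the positions carrying its own tag, so the per-category results can be merged without conflict after the parallel step.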