| CPC G06F 40/40 (2020.01) [G06F 16/3331 (2019.01); G06F 40/284 (2020.01)] | 20 Claims |

|
1. A method of text generation in a natural language processing (NLP) model, the method comprising:
receiving, via a data interface, a natural language input text for completing an NLP task;
encoding, via an encoder of the NLP model, the natural language input text into a text representation;
retrieving, from a memory, a directed search graph built based on a vocabulary, wherein the directed search graph includes a plurality of nodes representing tokens from the vocabulary;
performing, by a decoder of the NLP model, parallel searching on the directed search graph along multiple decoding paths to generate a sequence of output tokens, including:
generating, via the decoder, a plurality of next-node probabilities for a set of candidate nodes that are next to K previously decoded paths of nodes on the directed search graph based on the text representation, and
computing, for the K parallel decoded paths, respective scores for candidate nodes corresponding to the K previously decoded paths based on next-node probabilities, wherein the generating and the computing are performed in parallel among the K previously decoded paths;
selecting K nodes having highest scores from at least the candidate nodes as next nodes for the K previously decoded paths, respectively;
completing the parallel searching when a search budget is exhausted; and
generating an output of sequences of tokens based on paths of nodes on the directed search graph, the paths of nodes being generated in parallel.
|
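The search recited in claim 1 can be illustrated with a small sketch. The code below is not the claimed implementation. It is a minimal beam-style search over a directed graph, written under several assumptions: the graph is a plain dictionary from a node to its successor nodes, the decoder is replaced by a hypothetical `scorer` callback that returns next-node probabilities for a path, a path's score is the sum of log-probabilities along it, and the search budget is counted as the number of candidate nodes scored. The names `parallel_graph_search`, `toy_graph`, and `toy_scorer` are illustrative and do not appear in the claim. The loop visits the K paths one after another for readability, whereas the claim recites that generating and scoring happen in parallel among the K paths (for example, as one batched decoder call).

```python
import heapq
import math
from typing import Callable, Dict, List, Sequence, Tuple

Graph = Dict[str, List[str]]          # node -> successor nodes (assumed layout)
Path = Tuple[str, ...]                # a decoded path of nodes
Scorer = Callable[[Path, Sequence[str]], List[float]]  # stand-in for the decoder


def parallel_graph_search(
    graph: Graph, start: str, scorer: Scorer, k: int, budget: int
) -> List[Tuple[float, Path]]:
    """Keep the K best paths per step until the budget is spent.

    The budget is checked between steps, so the last step may overshoot it.
    Probabilities returned by `scorer` are assumed to be strictly positive.
    """
    beams: List[Tuple[float, Path]] = [(0.0, (start,))]
    spent = 0
    while spent < budget:
        pool: List[Tuple[float, Path]] = []
        expanded = False
        for score, path in beams:
            candidates = graph.get(path[-1], [])
            if not candidates:
                # A path with no successors competes unchanged.
                pool.append((score, path))
                continue
            expanded = True
            probs = scorer(path, candidates)
            spent += len(candidates)
            for node, p in zip(candidates, probs):
                pool.append((score + math.log(p), path + (node,)))
        if not expanded:
            break  # every kept path has ended; nothing left to search
        # Select the K highest-scoring candidates across all paths.
        beams = heapq.nlargest(k, pool, key=lambda item: item[0])
    return beams


# Toy vocabulary graph and fixed edge probabilities (made up for illustration).
toy_graph: Graph = {
    "<s>": ["the", "a"],
    "the": ["cat", "dog"],
    "a": ["cat"],
    "cat": [],
    "dog": [],
}
edge_prob = {
    ("<s>", "the"): 0.6, ("<s>", "a"): 0.4,
    ("the", "cat"): 0.7, ("the", "dog"): 0.3,
    ("a", "cat"): 1.0,
}


def toy_scorer(path: Path, candidates: Sequence[str]) -> List[float]:
    # A real decoder would condition on the encoded input text as well.
    return [edge_prob[(path[-1], node)] for node in candidates]


result = parallel_graph_search(toy_graph, "<s>", toy_scorer, k=2, budget=10)
for log_score, path in result:
    print(" ".join(path), round(math.exp(log_score), 2))
```

Pooling candidates from all K paths before taking the top K is one common reading of the selection step. A variant that keeps exactly one successor per path would replace the single `heapq.nlargest` call with a per-path maximum.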