US 12,412,564 B2
Text data processing method and apparatus
Tong Cui, Shenzhen (CN); Jinghui Xiao, Xi'an (CN); and Liangyou Li, Hong Kong (CN)
Assigned to Huawei Technologies Co., Ltd., Shenzhen (CN)
Filed by HUAWEI TECHNOLOGIES CO., LTD., Guangdong (CN)
Filed on Jan. 6, 2023, as Appl. No. 18/151,186.
Application 18/151,186 is a continuation of application No. PCT/CN2021/104902, filed on Jul. 7, 2021.
Claims priority of application No. 202010662105.9 (CN), filed on Jul. 10, 2020.
Prior Publication US 2023/0162723 A1, May 25, 2023
Int. Cl. G06F 17/00 (2019.01); G06F 40/279 (2020.01); G06F 40/30 (2020.01); G06F 40/40 (2020.01); G10L 15/06 (2013.01); G10L 15/16 (2006.01); G10L 15/20 (2006.01); G10L 15/22 (2006.01)
CPC G10L 15/063 (2013.01) [G06F 40/279 (2020.01); G06F 40/30 (2020.01); G06F 40/40 (2020.01); G10L 15/16 (2013.01); G10L 15/20 (2013.01); G10L 15/22 (2013.01)]
20 Claims
 
1. A text data processing method, comprising:
obtaining speech data and a first text, wherein the first text is a correct text corresponding to the speech data;
performing automatic speech recognition (ASR) on the speech data based on a first speech recognition model to obtain a second text;
processing the first text based on an initial noise generation model to obtain an output text;
obtaining a loss based on the output text and the second text;
updating the initial noise generation model based on the loss until the loss meets a preset condition to obtain a noise generation model, wherein the noise generation model is at least one of the following: a bidirectional long short-term memory (LSTM), a generative pre-training (GPT) model, or a Laser Tagger model;
obtaining a target text;
processing the target text based on the noise generation model to obtain a noisy text; and
training a text processing model, by using at least the noisy text as training data, to obtain a trained text processing model, wherein the trained text processing model is used to perform at least one of the following tasks: text translation, text semantic recognition, text classification, automatic question answering, information recommendation, or text emotion recognition.
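
The steps of claim 1 can be illustrated with a toy pipeline. The following is a minimal sketch under stated assumptions, not the claimed implementation: the claim names a bidirectional LSTM, a GPT model, or a Laser Tagger model as the noise generation model, whereas this sketch swaps in a simple token substitution table so that it runs without any dependencies. The names `ToyNoiseModel`, `token_loss`, and `train_noise_model` are hypothetical, the second texts are hard-coded in place of real ASR output, and the sketch assumes the ASR errors are one-for-one word substitutions.

```python
from collections import Counter, defaultdict


class ToyNoiseModel:
    """Token substitution table (assumption: a stand-in for the
    BiLSTM / GPT / Laser Tagger models named in the claim)."""

    def __init__(self):
        self.table = {}

    def generate(self, text):
        # Replace each token that has a learned noisy counterpart.
        return " ".join(self.table.get(tok, tok) for tok in text.split())


def token_loss(output_text, asr_text):
    """Fraction of token positions where the two texts disagree."""
    out, ref = output_text.split(), asr_text.split()
    mismatches = sum(o != r for o, r in zip(out, ref)) + abs(len(out) - len(ref))
    return mismatches / max(len(out), len(ref), 1)


def train_noise_model(pairs, threshold=0.0, max_epochs=10):
    """pairs: (first_text, second_text), i.e. correct text and ASR output."""
    model = ToyNoiseModel()
    for _ in range(max_epochs):
        # Loss between the model's output text and the ASR second text.
        loss = sum(token_loss(model.generate(c), a) for c, a in pairs) / len(pairs)
        if loss <= threshold:  # the "preset condition" in this sketch
            break
        # Update: map each clean token to its most frequent ASR counterpart.
        votes = defaultdict(Counter)
        for clean, asr in pairs:
            for c, a in zip(clean.split(), asr.split()):
                votes[c][a] += 1
        for tok, counter in votes.items():
            model.table[tok] = counter.most_common(1)[0][0]
    return model


# Hypothetical data: the second texts stand in for real ASR output.
pairs = [
    ("book two tickets", "book too tickets"),
    ("play their song", "play there song"),
]
noise_model = train_noise_model(pairs)

# Apply the trained noise model to a target text to obtain a noisy text.
target_text = "send their two tickets"
noisy_text = noise_model.generate(target_text)
print(noisy_text)  # send there too tickets

# The noisy text then joins the training data of a downstream text
# processing model; the label "ticketing" is a hypothetical class.
training_data = [(target_text, "ticketing"), (noisy_text, "ticketing")]
```

In this sketch the downstream model sees both the clean and the ASR-like noisy variant of the same labeled example, which is one plausible way to use "at least the noisy text as training data"; the claim itself does not fix how the training set is composed.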