US 12,314,677 B2
Method for pre-training model, device, and storage medium
Junyuan Shang, Beijing (CN); Shuohuan Wang, Beijing (CN); Siyu Ding, Beijing (CN); Yanbin Zhao, Beijing (CN); Chao Pang, Beijing (CN); and Yu Sun, Beijing (CN)
Assigned to BEIJING BAIDU NETCOM SCIENCE TECHNOLOGY CO., LTD., Beijing (CN)
Filed by BEIJING BAIDU NETCOM SCIENCE TECHNOLOGY CO., LTD., Beijing (CN)
Filed on Aug. 16, 2022, as Appl. No. 17/889,218.
Claims priority of application No. 202111260446.4 (CN), filed on Oct. 28, 2021.
Prior Publication US 2023/0040095 A1, Feb. 9, 2023
Int. Cl. G06F 40/40 (2020.01); G06F 40/289 (2020.01)
CPC G06F 40/40 (2020.01) [G06F 40/289 (2020.01)]
20 Claims
 
10. An electronic device, comprising:
at least one processor; and
a memory communicatively connected to the at least one processor; wherein,
the memory stores instructions executable by the at least one processor, and the instructions, when executed by the at least one processor, cause the at least one processor to perform operations, the operations comprising:
acquiring a sample natural language text;
generating N types of prompt words based on the sample natural language text, wherein N is a positive integer, and the N types comprise at least one of a task type, a topic type, a key phrase type, or a sentiment type;
generating sample input data based on the sample natural language text and the N types of prompt words; and
training an initial language model based on the sample input data, to obtain a pre-trained language model,
wherein the generating sample input data based on the sample natural language text and the N types of prompt words, comprises:
generating random sampling probabilities of the N types of prompt words respectively;
selecting, from the N types of prompt words, a prompt word whose random sampling probability is greater than a preset probability threshold;
intercepting a sample prefix text fragment from the sample natural language text; and
splicing the selected prompt word with the sample prefix text fragment to generate the sample input data.
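
The four sub-steps recited above for generating the sample input data can be sketched in Python as follows. This is an illustrative sketch only, not the patented implementation: the function name `build_sample_input`, its parameters, and the example prompt-word strings are hypothetical. The claim does not state how the random sampling probabilities are drawn, how the prefix is measured, or how the pieces are joined, so the sketch assumes a uniform draw on [0, 1), a character-length prefix, and a single-space separator.

```python
import random


def build_sample_input(text, prompt_words, threshold=0.5, prefix_len=16, rng=None):
    """Assemble one training input from prompt words and a text prefix.

    text         -- the sample natural language text
    prompt_words -- dict mapping a prompt type (e.g. "task", "topic")
                    to its prompt word; a hypothetical representation
    threshold    -- the preset probability threshold
    prefix_len   -- prefix length in characters (an assumption; a real
                    system would more likely count tokens)
    rng          -- optional random.Random instance for reproducibility
    """
    rng = rng or random.Random()

    # Step 1: generate a random sampling probability for each prompt type.
    # A uniform draw on [0, 1) is assumed here.
    probs = {ptype: rng.random() for ptype in prompt_words}

    # Step 2: select the prompt words whose probability is greater than
    # the preset threshold.
    selected = [word for ptype, word in prompt_words.items()
                if probs[ptype] > threshold]

    # Step 3: intercept a prefix fragment from the sample text.
    prefix = text[:prefix_len]

    # Step 4: splice the selected prompt words with the prefix fragment.
    return " ".join(selected + [prefix])


if __name__ == "__main__":
    words = {"task": "<task:qa>", "topic": "<topic:sports>"}
    sample = "The match ended in a draw after extra time."
    print(build_sample_input(sample, words, threshold=0.5, prefix_len=9))
```

Because each prompt type is kept or dropped independently on every call, the same text can yield inputs with different prompt-word combinations, including none at all.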