US 12,277,397 B2
Method of training model, method of determining word vector, device, medium, and product
Chao Ma, Beijing (CN); Jingshuai Zhang, Beijing (CN); Qifan Huang, Beijing (CN); Kaichun Yao, Beijing (CN); Peng Wang, Beijing (CN); and Hengshu Zhu, Beijing (CN)
Assigned to BEIJING BAIDU NETCOM SCIENCE TECHNOLOGY CO., LTD., Beijing (CN)
Filed by Beijing Baidu Netcom Science Technology Co., Ltd., Beijing (CN)
Filed on Dec. 29, 2021, as Appl. No. 17/564,369.
Claims priority of application No. 202110277972.5 (CN), filed on Mar. 15, 2021.
Prior Publication US 2022/0121826 A1, Apr. 21, 2022
Int. Cl. G06F 40/40 (2020.01); G06F 40/30 (2020.01); G06N 7/01 (2023.01)

CPC G06F 40/40 (2020.01) [G06F 40/30 (2020.01); G06N 7/01 (2023.01)]

16 Claims

1. A method of training a model, comprising:

acquiring a first word vector set corresponding to a first word set, wherein the first word set is acquired from a first corpus, and words in the first word set have a non-sequential relationship in linguistics;

for each word vector in the first word vector set,

inputting the word vector to a word embedding model to generate a reduced-dimensional word vector based on the word embedding model,

generating, for each other word vector in the first word vector set, a first probability distribution in the first word vector set based on the reduced-dimensional word vector, wherein the other word vector is a word vector in the first word vector set other than the word vector input to the word embedding model, and

adjusting a parameter of the word embedding model so as to minimize a difference between the first probability distribution generated using an adjusted word embedding model and a second probability distribution for the other word vector determined by the number of word vectors in the first word vector set,

determining, for each word in the first word set, a contrast word set in a complete word set, wherein the first word set belongs to the complete word set, and the words in the first word set are not included in the contrast word set;

acquiring a contrast word vector set corresponding to the contrast word set;

generating, by using the word embedding model, a probability of each word vector in the contrast word set appearing in the first word vector set; and

adjusting the parameter so as to minimize the probability of each word vector in the contrast word set appearing in the first word vector set, which is generated using the adjusted word embedding model, wherein

the determining of the contrast word set includes:

determining a sampling probability according to an appearance number of the word appearing in the first corpus and an appearance number of each word in the complete word set appearing in the first corpus; and

sampling from the words in the complete word set other than the words in the first word set by using the sampling probability, so as to determine the contrast word set.
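
The steps of claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation, and several choices are assumptions made only for illustration: the sampling probability is taken to be proportional to each word's count in the first corpus; the second probability distribution is taken to be uniform over the other words of the first word set (that is, set by the number of word vectors); the contrast-word probability is modeled with a sigmoid; and a table lookup in `emb_in` stands in for feeding a one-hot word vector through the word embedding model. All function names, variable names, and the toy corpus are hypothetical. The parameter adjustment itself (for example, by gradient descent on the returned loss) is left out.

```python
import math
import random
from collections import Counter


def sampling_probabilities(counts, first_word_set):
    """Assumed rule: probability proportional to corpus count,
    restricted to words outside the first word set."""
    candidates = {w: c for w, c in counts.items() if w not in first_word_set}
    total = sum(candidates.values())
    return {w: c / total for w, c in candidates.items()}


def sample_contrast_set(counts, first_word_set, k, rng):
    """Draw k contrast words (with replacement) using the sampling probability."""
    probs = sampling_probabilities(counts, first_word_set)
    words = list(probs)
    return rng.choices(words, weights=[probs[w] for w in words], k=k)


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def softmax(scores):
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def set_loss(center, first_word_set, contrast, emb_in, emb_out):
    """Loss for one input word: cross-entropy between the first distribution
    and an assumed uniform second distribution, plus a penalty on the
    probability of contrast words appearing in the set."""
    others = [w for w in first_word_set if w != center]
    hidden = emb_in[center]  # reduced-dimensional word vector
    first = softmax([dot(hidden, emb_out[w]) for w in others])
    target = 1.0 / len(others)  # determined by the number of word vectors
    positive = -sum(target * math.log(p) for p in first)
    negative = -sum(
        math.log(1.0 - sigmoid(dot(hidden, emb_out[w]))) for w in contrast
    )
    return positive + negative


# Hypothetical toy corpus; the first word set is an unordered group of words.
corpus = "python java sql the and python sql the the java of and".split()
counts = Counter(corpus)
first_word_set = ["python", "java", "sql"]
probs = sampling_probabilities(counts, first_word_set)

rng = random.Random(0)
contrast = sample_contrast_set(counts, first_word_set, k=2, rng=rng)

# Zero-initialized 2-dimensional parameters, purely for demonstration.
emb_in = {w: [0.0, 0.0] for w in counts}
emb_out = {w: [0.0, 0.0] for w in counts}
loss = set_loss("python", first_word_set, contrast, emb_in, emb_out)
```

Minimizing `positive` pulls the first probability distribution toward the assumed uniform target, and minimizing `negative` lowers the modeled probability that a sampled contrast word belongs to the first word set. Sampling with replacement is a simplification; a variant could instead draw distinct contrast words.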