US 11,989,516 B2
Method and apparatus for acquiring pre-trained model, electronic device and storage medium
Lijie Wang, Beijing (CN); Shuai Zhang, Beijing (CN); Xinyan Xiao, Beijing (CN); Yue Chang, Beijing (CN); and Tingting Li, Beijing (CN)
Assigned to BEIJING BAIDU NETCOM SCIENCE TECHNOLOGY CO., LTD., Beijing (CN)
Filed by BEIJING BAIDU NETCOM SCIENCE TECHNOLOGY CO., LTD., Beijing (CN)
Filed on Jan. 10, 2022, as Appl. No. 17/572,068.
Claims priority of application No. 202110734498.4 (CN), filed on Jun. 30, 2021.
Prior Publication US 2023/0004717 A1, Jan. 5, 2023
Int. Cl. G06F 40/289 (2020.01); G06N 20/00 (2019.01)
CPC G06F 40/289 (2020.01) [G06N 20/00 (2019.01)] 16 Claims
OG exemplary drawing
 
1. A method for acquiring a pre-trained model, comprising:
adding, in a process of training a pre-trained model using training sentences, a learning objective corresponding to syntactic information for a self-attention module in the pre-trained model; and
training the pre-trained model according to the learning objective,
wherein the learning objective comprises one or both of a first learning objective and a second learning objective,
wherein the first learning objective indicates that:
for any term x in the training sentence, a first weight corresponding to the term x is required to be greater than a second weight;
the first weight is an attention weight between the term x and any term y which is associated with the term x through a direct path in a dependency tree corresponding to the training sentence; and
the second weight is an attention weight between the term x and any term z which is associated with the term x through a weak path, or is not associated with the term x through any path, in the dependency tree.