US 12,067,347 B2
Sentence generation method and apparatus, device, and storage medium
Yizhang Tan, Shenzhen (CN); Jiachen Ding, Shenzhen (CN); and Changyu Miao, Shenzhen (CN)
Assigned to TENCENT TECHNOLOGY (SHENZHEN) COMPANY LIMITED, Shenzhen (CN)
Filed by TENCENT TECHNOLOGY (SHENZHEN) COMPANY LIMITED, Shenzhen (CN)
Filed on Apr. 14, 2021, as Appl. No. 17/230,985.
Application 17/230,985 is a continuation of application No. PCT/CN2020/073407, filed on Jan. 21, 2020.
Claims priority of application No. 201910068987.3 (CN), filed on Jan. 24, 2019.
Prior Publication US 2021/0232751 A1, Jul. 29, 2021
Int. Cl. G06F 40/20 (2020.01); G06F 40/12 (2020.01); G06F 40/253 (2020.01); G06F 40/289 (2020.01); G06N 3/084 (2023.01); G10L 15/22 (2006.01); G06F 40/205 (2020.01); G06F 40/40 (2020.01); G06F 40/56 (2020.01)
CPC G06F 40/12 (2020.01) [G06F 40/253 (2020.01); G06F 40/289 (2020.01); G06N 3/084 (2013.01); G06F 40/205 (2020.01); G06F 40/40 (2020.01); G06F 40/56 (2020.01); G10L 15/22 (2013.01)] 16 Claims
 
1. A sentence generation method, performed by a machine translation system, the method comprising:
obtaining an input sequence, wherein the input sequence is of a first language type;
encoding the input sequence to obtain a sentence eigenvector;
decoding the sentence eigenvector to obtain a first predetermined quantity of candidate sentence sequences;
clustering the first predetermined quantity of candidate sentence sequences to obtain sentence sequence sets of at least two types;
selecting a second predetermined quantity of candidate sentence sequences from the sentence sequence sets of at least two types, the second predetermined quantity of candidate sentence sequences including at least two sentence feature types wherein a first sentence feature type indicates that a sentence grammaticality of the candidate sentence sequence exceeds a grammaticality threshold, a target sentence feature type indicates that the sentence grammaticality of the candidate sentence sequence exceeds the grammaticality threshold and an association between the candidate sentence and the input sentence exceeds an association threshold; and
determining an output sequence corresponding to the input sequence according to the second predetermined quantity of candidate sentence sequences, wherein the output sequence is of a second language type.
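
The clustering, selecting, and determining steps recited above can be illustrated with a minimal sketch. It is not the patented implementation. It assumes the encoder and decoder have already produced candidate sentences in the second language, and that each candidate carries a precomputed grammaticality score and association score. The names `Candidate`, `feature_type`, `cluster`, `select`, and `choose_output`, the 0.5 thresholds, and the use of the feature type as the cluster label are all hypothetical simplifications chosen for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    text: str
    grammaticality: float  # hypothetical fluency score in [0, 1]
    association: float     # hypothetical relevance to the input, in [0, 1]


def feature_type(c, gram_thr=0.5, assoc_thr=0.5):
    """Label a candidate with a sentence feature type (thresholds are assumed)."""
    if c.grammaticality > gram_thr and c.association > assoc_thr:
        return "target"  # grammatical and associated with the input
    if c.grammaticality > gram_thr:
        return "first"   # grammatical, but weakly associated with the input
    return "other"       # below the grammaticality threshold


def cluster(candidates):
    """Group candidates into sentence sequence sets.

    Assumption: the feature type stands in for the cluster label; a real
    system would more likely cluster on learned sentence representations.
    """
    sets = defaultdict(list)
    for c in candidates:
        sets[feature_type(c)].append(c)
    return dict(sets)


def select(sets, quota):
    """Pick up to `quota` candidates, alternating between the two types."""
    queues = [
        sorted(sets.get(label, []),
               key=lambda c: c.grammaticality + c.association,
               reverse=True)
        for label in ("target", "first")
    ]
    picked = []
    while len(picked) < quota and any(queues):
        for q in queues:
            if q and len(picked) < quota:
                picked.append(q.pop(0))
    return picked


def choose_output(selected):
    """Prefer target-type candidates, then the strongest association."""
    targets = [c for c in selected if feature_type(c) == "target"]
    pool = targets or selected
    return max(pool, key=lambda c: c.association).text


candidates = [
    Candidate("I do not know.", 0.9, 0.1),
    Candidate("It is sunny in Shenzhen today.", 0.8, 0.9),
    Candidate("Sunny today it Shenzhen is.", 0.3, 0.8),
    Candidate("The weather is fine.", 0.9, 0.7),
    Candidate("Okay.", 0.95, 0.0),
]
selected = select(cluster(candidates), quota=3)
print(choose_output(selected))
```

Here the five candidates play the role of the first predetermined quantity and `quota=3` plays the role of the second predetermined quantity. Alternating between the two sets is one simple way to keep both sentence feature types represented in the selection before the output sequence is determined.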