US 12,308,021 B2
Punctuation mark delete model training device, punctuation mark delete model, and determination device
Taku Katou, Chiyoda-ku (JP)
Assigned to NTT DOCOMO, INC., Chiyoda-ku (JP)
Appl. No. 17/995,529
Filed by NTT DOCOMO, INC., Chiyoda-ku (JP)
PCT Filed Apr. 8, 2021, PCT No. PCT/JP2021/014931
§ 371(c)(1), (2) Date Oct. 5, 2022,
PCT Pub. No. WO2021/215262, PCT Pub. Date Oct. 28, 2021.
Claims priority of application No. 2020-074788 (JP), filed on Apr. 20, 2020.
Prior Publication US 2023/0223017 A1, Jul. 13, 2023
Int. Cl. G10L 15/18 (2013.01); G06F 16/33 (2025.01); G06F 40/232 (2020.01); G10L 15/22 (2006.01); G10L 15/26 (2006.01); G10L 25/93 (2013.01)
CPC G10L 15/1822 (2013.01) [G06F 16/33 (2019.01); G06F 40/232 (2020.01); G10L 15/22 (2013.01); G10L 15/26 (2013.01); G10L 25/93 (2013.01)] 9 Claims
 
1. A punctuation mark delete model learning device for generating, through machine learning, a punctuation mark delete model for determining whether or not a punctuation mark assigned to text obtained by speech recognition processing is correct,
wherein the punctuation mark delete model receives two consecutive sentences of a first sentence with a punctuation mark assigned at an end of the sentence and a second sentence following the first sentence, and outputs a probability indicating whether the punctuation mark assigned at an end of the first sentence is correct, and
the punctuation mark delete model learning device comprises circuitry configured to:
generate first learning data consisting of a pair of an input sentence including a preceding sentence, the preceding sentence being a sentence with a punctuation mark assigned at an end of the sentence, and a subsequent sentence, the subsequent sentence being a sentence following the punctuation mark in text constituting a first text corpus, and a label indicating whether or not the assignment of the punctuation mark is correct on the basis of the first text corpus, the first text corpus being text including one or more sentences obtained by speech recognition processing and having a punctuation mark assigned thereto on the basis of information obtained by speech recognition processing; and
update parameters of the punctuation mark delete model on the basis of an error between the probability obtained by inputting the input sentence of the first learning data to the punctuation mark delete model and the label associated with the input sentence.
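
The two steps recited in claim 1 (building pairs of a preceding and a subsequent sentence with a correctness label, then updating model parameters from the error between the output probability and the label) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the patent: the function names `build_examples`, `features`, `predict`, and `update` are hypothetical, the labels are supplied by hand rather than derived from a corpus, and a tiny logistic regression over word features stands in for whatever model architecture the specification actually uses.

```python
import math

def build_examples(sentences, labels):
    """Pair each sentence with the one that follows it.

    sentences: punctuated sentences from speech recognition output, in order.
    labels[i]: 1 if the punctuation mark ending sentences[i] is correct, else 0.
    (Hand-supplied here; how labels are derived is outside this sketch.)
    """
    if len(labels) != len(sentences) - 1:
        raise ValueError("need one label per consecutive sentence pair")
    return [((sentences[i], sentences[i + 1]), labels[i])
            for i in range(len(labels))]

def features(pair):
    """Toy features: last word before the mark, first word after it.

    Assumes both sentences contain at least one word.
    """
    preceding, subsequent = pair
    last = preceding.rstrip(".").split()[-1].lower()
    first = subsequent.rstrip(".").split()[0].lower()
    return ["bias", "last=" + last, "first=" + first]

def predict(weights, pair):
    """Probability that the mark ending the preceding sentence is correct."""
    z = sum(weights.get(f, 0.0) for f in features(pair))
    return 1.0 / (1.0 + math.exp(-z))

def update(weights, pair, label, lr=0.5):
    """One gradient step on binary cross-entropy; returns the loss before the step."""
    p = predict(weights, pair)
    error = p - label  # gradient of the loss with respect to the logit
    for f in features(pair):
        weights[f] = weights.get(f, 0.0) - lr * error
    p = min(max(p, 1e-12), 1.0 - 1e-12)  # guard against log(0)
    return -(label * math.log(p) + (1 - label) * math.log(1.0 - p))

# Hypothetical recognizer output: the period after "the" is a misplaced mark.
sentences = ["I went to the.", "store yesterday.", "It was closed."]
examples = build_examples(sentences, labels=[0, 1])

weights = {}
losses = []
for epoch in range(50):
    losses.append(sum(update(weights, pair, label) for pair, label in examples))

for pair, label in examples:
    print(pair, label, round(predict(weights, pair), 3))
```

After training, a probability below a chosen threshold (0.5 would be a natural default) could be read as a signal that the punctuation mark should be deleted; the threshold is likewise an assumption of this sketch.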