US 12,277,926 B2
Intelligent medical speech automatic recognition method and system thereof
Der-Yang Cho, Taichung (TW); Kai-Cheng Hsu, Taichung (TW); Ya-Lun Wu, Taichung (TW); and Kai-Ching Chen, Taichung (TW)
Assigned to China Medical University, Taichung (TW)
Filed by China Medical University, Taichung (TW)
Filed on Sep. 29, 2021, as Appl. No. 17/488,658.
Claims priority of application No. 110122805 (TW), filed on Jun. 22, 2021.
Prior Publication US 2022/0406296 A1, Dec. 22, 2022
Int. Cl. G10L 15/06 (2013.01); G10L 15/22 (2006.01); G10L 15/26 (2006.01); G10L 21/0208 (2013.01)
CPC G10L 15/063 (2013.01) [G10L 15/22 (2013.01); G10L 15/26 (2013.01); G10L 21/0208 (2013.01)]
2 Claims
 
1. An intelligent medical speech automatic recognition method, comprising:
performing a first model training step to drive a processing unit to train a generic statement data and a medical statement data of a database to establish a first model;
performing a second model training step to drive the processing unit to train a medical textbook data of the database to establish a second model;
performing a voice receiving step to drive a voice receiver to receive a speech signal, wherein the voice receiver is signally connected to the processing unit;
performing a signal pre-treatment step to drive the processing unit to receive the speech signal from the voice receiver and transform the speech signal into a to-be-recognized speech signal; and
performing a transforming step to drive the processing unit to transform and recognize the to-be-recognized speech signal into a complete sentence writing character according to the first model and the second model;
wherein the generic statement data, the medical statement data and the medical textbook data are different from each other;
wherein the transforming step comprises:
performing a first transforming step to drive the processing unit to transform the to-be-recognized speech signal into a writing character according to the first model; and
performing a second transforming step to drive the processing unit to transform the writing character into the complete sentence writing character according to the second model without transforming the speech signal into a phonography;
wherein the complete sentence writing character comprises at least one punctuation;
wherein the first model is trained to recognize a generic vocabulary and a medical field vocabulary;
wherein the medical statement data comprises a plurality of medical vocabulary speech signals and a plurality of medical vocabulary writing characters corresponding to the medical vocabulary speech signals; and
wherein the medical vocabulary writing characters comprise at least one hybrid vocabulary with Chinese words and English words;
wherein the signal pre-treatment step comprises:
performing a noise filtering step to drive the processing unit to filter out a noise of the speech signal, and generate a human voice interval signal; and
performing a target interval enhancing step to drive the processing unit to enhance the human voice interval signal according to a human voice frequency band, and generate the to-be-recognized speech signal;
wherein the generic statement data comprises a plurality of generic vocabulary speech signals and a plurality of generic vocabulary writing characters corresponding to the generic vocabulary speech signals; and
wherein the first model training step, the second model training step, the voice receiving step, the signal pre-treatment step, the first transforming step and the second transforming step of the transforming step are performed in sequence.
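
For illustration only, the run-time order recited in claim 1 (signal pre-treatment, then a first and a second transforming step) can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the patented implementation: the claim does not specify any algorithm, so the energy-threshold interval detector, the one-pole band-pass filter, the 300 to 3400 Hz band, the 16 kHz sample rate, and all function and parameter names are hypothetical. The two training steps are omitted, and the trained first and second models are represented by caller-supplied functions.

```python
import math
from typing import Callable, List, Sequence


def voice_interval(samples: Sequence[float], frame_len: int = 160,
                   threshold_ratio: float = 0.1) -> List[float]:
    """Stand-in for the noise filtering step (assumed method).

    Splits the signal into frames and keeps the span from the first to the
    last frame whose mean energy reaches threshold_ratio times the peak
    frame energy. Returns an empty list for an empty or all-zero signal.
    """
    frames = [samples[i:i + frame_len] for i in range(0, len(samples), frame_len)]
    energies = [sum(x * x for x in f) / len(f) for f in frames]
    if not energies or max(energies) == 0:
        return []
    limit = threshold_ratio * max(energies)
    active = [i for i, e in enumerate(energies) if e >= limit]
    start, end = active[0] * frame_len, (active[-1] + 1) * frame_len
    return list(samples[start:end])


def enhance_voice_band(samples: Sequence[float], sample_rate: int = 16000,
                       low_hz: float = 300.0, high_hz: float = 3400.0) -> List[float]:
    """Stand-in for the target interval enhancing step (assumed method).

    Applies a one-pole high-pass at low_hz followed by a one-pole low-pass
    at high_hz, which attenuates content outside the assumed voice band.
    """
    dt = 1.0 / sample_rate
    rc_hp = 1.0 / (2.0 * math.pi * low_hz)
    rc_lp = 1.0 / (2.0 * math.pi * high_hz)
    a_hp = rc_hp / (rc_hp + dt)
    a_lp = dt / (rc_lp + dt)
    out: List[float] = []
    hp_prev = x_prev = lp_prev = 0.0
    for x in samples:
        hp = a_hp * (hp_prev + x - x_prev)      # high-pass stage
        lp = lp_prev + a_lp * (hp - lp_prev)    # low-pass stage
        out.append(lp)
        hp_prev, x_prev, lp_prev = hp, x, lp
    return out


def recognize(samples: Sequence[float],
              first_model: Callable[[List[float]], str],
              second_model: Callable[[str], str],
              sample_rate: int = 16000) -> str:
    """Runs pre-treatment, then the two transforming steps in sequence.

    first_model: hypothetical callable, speech signal -> writing characters.
    second_model: hypothetical callable, writing characters -> punctuated
    complete sentence. Neither is a real library API.
    """
    interval = voice_interval(samples)
    to_be_recognized = enhance_voice_band(interval, sample_rate)
    characters = first_model(to_be_recognized)
    return second_model(characters)
```

In this sketch the second model receives only the text produced by the first model, which mirrors the claim's requirement that the second transforming step works on the writing character rather than on a phonetic transcription of the speech signal.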