US 12,277,926 B2
Intelligent medical speech automatic recognition method and system thereof
Der-Yang Cho, Taichung (TW); Kai-Cheng Hsu, Taichung (TW); Ya-Lun Wu, Taichung (TW); and Kai-Ching Chen, Taichung (TW)
Assigned to China Medical University, Taichung (TW)
Filed by China Medical University, Taichung (TW)
Filed on Sep. 29, 2021, as Appl. No. 17/488,658.
Claims priority of application No. 110122805 (TW), filed on Jun. 22, 2021.
Prior Publication US 2022/0406296 A1, Dec. 22, 2022
Int. Cl. G10L 15/06 (2013.01); G10L 15/22 (2006.01); G10L 15/26 (2006.01); G10L 21/0208 (2013.01)

CPC G10L 15/063 (2013.01) [G10L 15/22 (2013.01); G10L 15/26 (2013.01); G10L 21/0208 (2013.01)]

2 Claims

1. An intelligent medical speech automatic recognition method, comprising:

performing a first model training step to drive a processing unit to train a generic statement data and a medical statement data of a database to establish a first model;

performing a second model training step to drive the processing unit to train a medical textbook data of the database to establish a second model;

performing a voice receiving step to drive a voice receiver to receive a speech signal, wherein the voice receiver is signally connected to the processing unit;

performing a signal pre-treatment step to drive the processing unit to receive the speech signal from the voice receiver and transform the speech signal into a to-be-recognized speech signal; and

performing a transforming step to drive the processing unit to transform and recognize the to-be-recognized speech signal into a complete sentence writing character according to the first model and the second model;

wherein the generic statement data, the medical statement data and the medical textbook data are different from each other;

wherein the transforming step comprises:

performing a first transforming step to drive the processing unit to transform the to-be-recognized speech signal into a writing character according to the first model; and

performing a second transforming step to drive the processing unit to transform the writing character into the complete sentence writing character according to the second model without transforming the speech signal into a phonography;

wherein the complete sentence writing character comprises at least one punctuation;

wherein the first model is trained to recognize a generic vocabulary and a medical field vocabulary;

wherein the medical statement data comprises a plurality of medical vocabulary speech signals and a plurality of medical vocabulary writing characters corresponding to the medical vocabulary speech signals; and

wherein the medical vocabulary writing characters comprise at least one hybrid vocabulary with Chinese words and English words;

wherein the signal pre-treatment step comprises:

performing a noise filtering step to drive the processing unit to filter out a noise of the speech signal, and generate a human voice interval signal; and

performing a target interval enhancing step to drive the processing unit to enhance the human voice interval signal according to a human voice frequency band, and generate the to-be-recognized speech signal;

wherein the generic statement data comprises a plurality of generic vocabulary speech signals and a plurality of generic vocabulary writing characters corresponding to the generic vocabulary speech signals; and

wherein the first model training step, the second model training step, the voice receiving step, the signal pre-treatment step, the first transforming step and the second transforming step of the transforming step are performed in sequence.
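
The runtime portion of claim 1 (signal pre-treatment, then the first and second transforming steps) can be illustrated with a minimal sketch. It is not the patented implementation. Everything named here is an assumption made for illustration: the function names, the energy-gate approach to noise filtering, the first-order filter cascade used for band enhancement, the 300 to 3400 Hz band, the frame length and the threshold. The two models are passed in as plain callables because the claim does not specify their architecture.

```python
import math
from typing import Callable, List, Sequence


def noise_filter(signal: Sequence[float], frame_len: int = 160,
                 threshold: float = 0.01) -> List[float]:
    """Hypothetical noise filtering step: keep only frames whose mean
    energy exceeds a threshold, yielding a 'human voice interval signal'."""
    voiced: List[float] = []
    for start in range(0, len(signal), frame_len):
        frame = signal[start:start + frame_len]
        energy = sum(x * x for x in frame) / len(frame)
        if energy > threshold:
            voiced.extend(frame)
    return voiced


def enhance_voice_band(signal: Sequence[float], sample_rate: int = 16000,
                       low_hz: float = 300.0,
                       high_hz: float = 3400.0) -> List[float]:
    """Hypothetical target interval enhancing step: a first-order high-pass
    followed by a first-order low-pass, roughly passing an assumed
    voice band and attenuating content outside it."""
    dt = 1.0 / sample_rate
    rc_hp = 1.0 / (2.0 * math.pi * low_hz)
    rc_lp = 1.0 / (2.0 * math.pi * high_hz)
    a_hp = rc_hp / (rc_hp + dt)   # high-pass smoothing coefficient
    a_lp = dt / (rc_lp + dt)      # low-pass smoothing coefficient
    out: List[float] = []
    x_prev = hp_prev = lp_prev = 0.0
    for x in signal:
        hp = a_hp * (hp_prev + x - x_prev)
        lp = lp_prev + a_lp * (hp - lp_prev)
        out.append(lp)
        x_prev, hp_prev, lp_prev = x, hp, lp
    return out


def recognize(signal: Sequence[float],
              first_model: Callable[[List[float]], str],
              second_model: Callable[[str], str]) -> str:
    """Run the claimed steps in order. The second model receives the
    writing characters directly; no phonetic transcription is produced."""
    voiced = noise_filter(signal)                 # noise filtering step
    to_be_recognized = enhance_voice_band(voiced)  # target interval enhancing
    characters = first_model(to_be_recognized)     # first transforming step
    return second_model(characters)                # second transforming step


if __name__ == "__main__":
    # Stub models standing in for the trained first and second models.
    stub_first = lambda samples: "病人有 fever"
    stub_second = lambda text: text + "。"
    print(recognize([0.5] * 320, stub_first, stub_second))
```

In this sketch the stub first model returns a hybrid Chinese and English string and the stub second model appends a punctuation mark, mirroring the claim's requirements that the vocabulary may be hybrid and that the complete sentence contains at least one punctuation. A real system would replace both stubs with trained models.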