US 11,749,260 B1
Method for speech recognition with grapheme information
Hwanbok Mun, Seoul (KR); Dongchan Shin, Seoul (KR); Gyujin Kim, Incheon (KR); Seongmin Park, Seoul (KR); and Jihwa Lee, Seoul (KR)
Assigned to ACTIONPOWER CORP., Seoul (KR)
Filed by ActionPower Corp., Seoul (KR)
Filed on Sep. 23, 2022, as Appl. No. 17/952,072.
Claims priority of application No. 10-2022-0078703 (KR), filed on Jun. 28, 2022.
Int. Cl. G10L 15/06 (2013.01); G10L 15/02 (2006.01); G10L 15/16 (2006.01); G10L 15/22 (2006.01); G10L 15/197 (2013.01)
CPC G10L 15/063 (2013.01) [G10L 15/02 (2013.01); G10L 15/16 (2013.01); G10L 15/197 (2013.01); G10L 15/22 (2013.01)]
13 Claims
 
1. A method for speech recognition performed by a computing device, the method comprising:
inputting voice information into an encoder to extract a first feature vector and calculating a first loss function;
inputting the first feature vector extracted from the encoder to a first decoder to perform prediction on the voice information, calculating a second loss function, and extracting a second feature vector;
inputting the second feature vector extracted from the first decoder to a second decoder to perform grapheme-based prediction, and calculating a third loss function;
calculating a final loss function based on the first loss function, the second loss function, and the third loss function; and
training at least one of the encoder, the first decoder, or the second decoder to decrease the calculated final loss function.
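
The data flow recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation. The claim says only that the final loss is "based on" the three losses, so the weighted sum used here, the weight values, and the names `final_loss` and `forward_losses` are assumptions made for illustration. The encoder and the two decoders are passed in as hypothetical callables, since the claim does not fix their architectures or the specific loss functions.

```python
from typing import Callable, Sequence, Tuple

Vector = Sequence[float]


def final_loss(
    loss1: float,
    loss2: float,
    loss3: float,
    weights: Tuple[float, float, float] = (1.0, 1.0, 1.0),
) -> float:
    """Combine the three losses into one scalar.

    Assumption: a weighted sum. The claim only requires that the final
    loss be based on all three losses.
    """
    w1, w2, w3 = weights
    return w1 * loss1 + w2 * loss2 + w3 * loss3


def forward_losses(
    voice: Vector,
    encoder: Callable[[Vector], Tuple[Vector, float]],
    first_decoder: Callable[[Vector], Tuple[Vector, float]],
    second_decoder: Callable[[Vector], float],
    weights: Tuple[float, float, float] = (1.0, 1.0, 1.0),
) -> float:
    """Run one forward pass in the order recited in claim 1."""
    # Encoder: voice information -> first feature vector, first loss.
    feature1, loss1 = encoder(voice)
    # First decoder: first feature vector -> second feature vector, second loss.
    feature2, loss2 = first_decoder(feature1)
    # Second decoder: grapheme-based prediction on the second feature vector.
    loss3 = second_decoder(feature2)
    # Final loss; a trainer would update the models to decrease this value.
    return final_loss(loss1, loss2, loss3, weights)
```

In an actual system the returned value would be handed to an optimizer that updates at least one of the three models. The sketch stops at the forward pass because the claim does not specify the training procedure beyond decreasing the final loss.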