US 12,462,790 B2
Quality estimation for automatic speech recognition
Kai Fan, Sunnyvale, CA (US); Bo Li, Hangzhou (CN); and Jiayi Wang, Hangzhou (CN)
Assigned to Alibaba Group Holding Limited, Grand Cayman (KY)
Filed by Alibaba Group Holding Limited, Grand Cayman (KY)
Filed on Jul. 20, 2023, as Appl. No. 18/224,514.
Application 18/224,514 is a continuation of application No. PCT/CN2021/073073, filed on Jan. 21, 2021.
Prior Publication US 2023/0360636 A1, Nov. 9, 2023
Int. Cl. G10L 15/01 (2013.01); G10L 15/06 (2013.01)
CPC G10L 15/01 (2013.01) [G10L 15/063 (2013.01)]
20 Claims
 
1. A method comprising:
training a transformer learning model on inputs comprising a sequence of audio tokens, the trained transformer learning model being executable by one or more processors of a computing system to output a sequence of feature representations; and
training a quality estimation learning model on inputs comprising the sequence of feature representations, the trained quality estimation learning model being executable by the one or more processors to output a probability of a word error rate value.
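
The data flow recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation: it assumes a single-head self-attention layer as a stand-in for the transformer learning model, a mean-pool plus softmax head as a stand-in for the quality estimation learning model, and a reading of "a probability of a word error rate value" as a distribution over discretized WER bins. All names and sizes (`self_attention`, `estimate_wer_distribution`, `DIM`, `BINS`) are hypothetical, the weights are random, and both training steps are omitted. The `word_error_rate` helper uses the standard definition (word-level edit distance divided by reference length) and shows how a training target could be derived, assuming reference transcripts are available.

```python
import math
import random


def word_error_rate(reference, hypothesis):
    """Standard WER: word-level edit distance / number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    # prev[j] is the edit distance between ref[:i-1] and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            curr.append(min(prev[j] + 1,              # deletion
                            curr[j - 1] + 1,          # insertion
                            prev[j - 1] + (r != h)))  # substitution or match
        prev = curr
    return prev[-1] / len(ref)  # assumes a non-empty reference


def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(matrix, vec):
    return [sum(w * x for w, x in zip(row, vec)) for row in matrix]


def random_matrix(rows, cols, rng):
    return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]


def self_attention(tokens, wq, wk, wv):
    """Single-head scaled dot-product self-attention (assumed stand-in for
    the transformer): returns one feature vector per input token."""
    queries = [matvec(wq, t) for t in tokens]
    keys = [matvec(wk, t) for t in tokens]
    values = [matvec(wv, t) for t in tokens]
    scale = math.sqrt(len(keys[0]))
    features = []
    for q in queries:
        weights = softmax([sum(a * b for a, b in zip(q, k)) / scale
                           for k in keys])
        features.append([sum(w * v[d] for w, v in zip(weights, values))
                         for d in range(len(values[0]))])
    return features


def estimate_wer_distribution(features, w_out):
    """Assumed quality estimation head: mean-pool the feature sequence and
    map it to probabilities over discretized WER bins."""
    dim = len(features[0])
    pooled = [sum(f[d] for f in features) / len(features) for d in range(dim)]
    return softmax(matvec(w_out, pooled))


rng = random.Random(0)
DIM, BINS = 4, 5  # hypothetical embedding size and number of WER bins
# Random vectors stand in for embedded audio tokens.
audio_tokens = [[rng.gauss(0.0, 1.0) for _ in range(DIM)] for _ in range(6)]
wq, wk, wv = (random_matrix(DIM, DIM, rng) for _ in range(3))
w_out = random_matrix(BINS, DIM, rng)

features = self_attention(audio_tokens, wq, wk, wv)     # feature representations
wer_probs = estimate_wer_distribution(features, w_out)  # probabilities per WER bin
```

In this sketch the second stage never sees a transcript: it consumes only the feature representations produced by the first stage, which mirrors the ordering of the two training steps in the claim.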