US 11,900,958 B2
Method and system for processing speech signal
Shiliang Zhang, Hangzhou (CN); Ming Lei, Beijing (CN); Wei Li, Beijing (CN); and Haitao Yao, Beijing (CN)
Assigned to Alibaba Group Holding Limited, Grand Cayman (KY)
Filed by ALIBABA GROUP HOLDING LIMITED, Grand Cayman (KY)
Filed on Dec. 26, 2022, as Appl. No. 18/146,440.
Application 18/146,440 is a continuation of application No. 16/698,536, filed on Nov. 27, 2019, granted, now Pat. No. 11,538,488.
Claims priority of application No. 201811457674.9 (CN), filed on Nov. 30, 2018.
Prior Publication US 2023/0245672 A1, Aug. 3, 2023
This patent is subject to a terminal disclaimer.
Int. Cl. G10L 25/12 (2013.01); G10L 15/16 (2006.01); G10L 15/187 (2013.01); G10L 25/30 (2013.01)
CPC G10L 25/12 (2013.01) [G10L 15/16 (2013.01); G10L 15/187 (2013.01); G10L 25/30 (2013.01)]
20 Claims
 
1. A method for processing a speech signal, comprising:
obtaining a speech signal;
generating, using the speech signal, a first sequence of feature vectors;
selecting, from the first sequence of feature vectors, a second sequence of m*n consecutive feature vectors;
generating a third sequence of n intermediate vectors by applying the second sequence to a first neural network model, each intermediate vector corresponding to a subsequence of m consecutive feature vectors in the first sequence;
generating a fourth sequence of n average probability vectors by applying each of the n intermediate vectors to a corresponding second neural network model; and
determining the phones in the second sequence using the fourth sequence of n average probability vectors.
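
The data flow recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation. The layer shapes, the random weights, the single tanh layer standing in for the first neural network model, the softmax heads standing in for the second neural network models, and the argmax decision are all assumptions made for illustration, as are the names `decode_block`, `w_shared`, and `w_heads`.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over a 1-D vector."""
    e = np.exp(x - x.max())
    return e / e.sum()

def decode_block(features, start, m, n, w_shared, w_heads):
    """Hypothetical sketch of the claimed steps for one block of m*n frames."""
    # Second sequence: m*n consecutive feature vectors from the first sequence.
    block = features[start:start + m * n]
    if block.shape[0] != m * n:
        raise ValueError("not enough feature vectors for an m*n block")
    d = block.shape[1]
    # First model (assumed: one shared tanh layer). Each row groups m
    # consecutive frames, so each of the n intermediate vectors corresponds
    # to one subsequence of m feature vectors.
    intermediates = np.tanh(block.reshape(n, m * d) @ w_shared)
    # Second models (assumed: one softmax head per intermediate vector).
    # Each output row stands in for an average probability vector.
    probs = np.stack([softmax(intermediates[i] @ w_heads[i]) for i in range(n)])
    # Phone decision (assumed: highest-probability class per subsequence).
    phones = probs.argmax(axis=1)
    return probs, phones

# Toy dimensions, chosen arbitrarily for the example.
rng = np.random.default_rng(0)
d, m, n, h, num_phones = 4, 3, 2, 5, 6
features = rng.normal(size=(20, d))            # first sequence of feature vectors
w_shared = rng.normal(size=(m * d, h))         # first-model weights (random)
w_heads = rng.normal(size=(n, h, num_phones))  # one weight matrix per second model

probs, phones = decode_block(features, 0, m, n, w_shared, w_heads)
print(probs.shape, phones.shape)
```

In this sketch one pass over a block of m*n frames yields n phone decisions, one per subsequence of m frames, rather than one decision per frame.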