US 12,354,614 B2
Speech coding method and apparatus for performing the same
Woo-taek Lim, Sejong-si (KR); Seung Kwon Beack, Daejeon (KR); Inseon Jang, Daejeon (KR); Jongmo Sung, Daejeon (KR); Tae Jin Lee, Daejeon (KR); Byeongho Cho, Daejeon (KR); Minje Kim, Bloomington, IN (US); and Haici Yang, Bloomington, IN (US)
Assigned to ELECTRONICS AND TELECOMMUNICATIONS RESEARCH INSTITUTE, Daejeon (KR); and THE TRUSTEES OF INDIANA UNIVERSITY, Indianapolis, IN (US)
Filed by Electronics and Telecommunications Research Institute, Daejeon (KR); and The Trustees of Indiana University, Indianapolis, IN (US)
Filed on Sep. 26, 2023, as Appl. No. 18/474,997.
Claims priority of provisional application 63/420,438, filed on Oct. 28, 2022.
Claims priority of application No. 10-2023-0102244 (KR), filed on Aug. 4, 2023.
Prior Publication US 2024/0013796 A1, Jan. 11, 2024
Int. Cl. G10L 19/07 (2013.01); G10L 19/038 (2013.01)
CPC G10L 19/038 (2013.01) [G10L 19/07 (2013.01)]
14 Claims
 
1. A method of encoding a speech signal, the method comprising:
predicting a feature vector of each of a plurality of frames comprised of the speech signal based on a ground-truth feature vector of a previous frame of each of the plurality of frames;
calculating a residual signal corresponding to each of the plurality of frames based on a ground-truth feature vector of each of the plurality of frames and a predicted feature vector of each of the plurality of frames; and
generating a bitstring corresponding to each of the plurality of frames by quantizing the residual signal,
wherein the generating of the bitstring comprises:
determining a threshold value related to energy of the residual signals, based on a target bitrate for the bitstring; and
applying a first quantization scheme or a second quantization scheme to the residual signal based on the residual signal and the threshold value.
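The steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation, and every name and number in it is a hypothetical choice made for illustration: the predictor simply repeats the previous frame's ground-truth features, the two quantization schemes are uniform scalar quantizers with a fine and a coarse step, the per-frame bit costs `bits_fine` and `bits_coarse` are nominal, and the threshold is taken as the residual energy above which just enough frames fall to meet the target bitrate. The output is a list of integer indices per frame rather than a packed bitstring.

```python
from typing import List, Tuple

Vector = List[float]


def predict(prev: Vector) -> Vector:
    # Placeholder predictor (assumption): repeat the previous frame's features.
    return list(prev)


def energy(v: Vector) -> float:
    # Sum of squares of the residual components.
    return sum(x * x for x in v)


def energy_threshold(energies: List[float], target_bps: float,
                     frames_per_sec: float, bits_fine: int,
                     bits_coarse: int) -> float:
    # Average bit budget per frame implied by the target bitrate.
    budget = target_bps / frames_per_sec
    # Fraction of frames that can afford the fine scheme (assumed cost model).
    frac_fine = (budget - bits_coarse) / (bits_fine - bits_coarse)
    frac_fine = min(1.0, max(0.0, frac_fine))
    n_fine = int(frac_fine * len(energies))
    if n_fine == 0:
        return float("inf")  # no frame qualifies for the fine scheme
    # Energy of the n_fine-th most energetic residual; ties may admit extra frames.
    return sorted(energies, reverse=True)[n_fine - 1]


def quantize(v: Vector, step: float) -> List[int]:
    # Uniform scalar quantizer: index of the nearest multiple of `step`.
    return [round(x / step) for x in v]


def encode(frames: List[Vector], target_bps: float,
           frames_per_sec: float = 50.0, bits_fine: int = 64,
           bits_coarse: int = 16, step_fine: float = 0.05,
           step_coarse: float = 0.5) -> List[Tuple[int, List[int]]]:
    # Assumption: the first frame is predicted from an all-zero vector.
    prev = [0.0] * len(frames[0])
    residuals = []
    for gt in frames:
        pred = predict(prev)
        residuals.append([g - p for g, p in zip(gt, pred)])
        prev = gt  # the next prediction uses this frame's ground truth
    energies = [energy(r) for r in residuals]
    thr = energy_threshold(energies, target_bps, frames_per_sec,
                           bits_fine, bits_coarse)
    coded = []
    for r, e in zip(residuals, energies):
        if e >= thr:
            coded.append((1, quantize(r, step_fine)))    # first scheme
        else:
            coded.append((2, quantize(r, step_coarse)))  # second scheme
    return coded


frames = [[1.0, 0.0], [1.0, 0.0], [0.0, 2.0], [0.0, 2.0]]
coded = encode(frames, target_bps=2000.0)
low_rate = encode(frames, target_bps=800.0)
```

In this toy input the first and third frames differ from their predecessors, so their residuals carry the most energy and receive the fine quantizer. Lowering `target_bps` raises the threshold until every frame falls back to the coarse quantizer.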