US 12,315,520 B2
Sound signal downmixing method, sound signal coding method, sound signal downmixing apparatus, sound signal coding apparatus, program and recording medium
Ryosuke Sugiura, Tokyo (JP); Takehiro Moriya, Tokyo (JP); and Yutaka Kamamoto, Tokyo (JP)
Assigned to NIPPON TELEGRAPH AND TELEPHONE CORPORATION, Tokyo (JP)
Appl. No. 17/909,677
Filed by NIPPON TELEGRAPH AND TELEPHONE CORPORATION, Tokyo (JP)
PCT Filed Feb. 8, 2021, PCT No. PCT/JP2021/004640
§ 371(c)(1), (2) Date Sep. 6, 2022,
PCT Pub. No. WO2021/181975, PCT Pub. Date Sep. 16, 2021.
Claims priority of application No. PCT/JP2020/010080 (WO), filed on Mar. 9, 2020; application No. PCT/JP2020/010081 (WO), filed on Mar. 9, 2020; and application No. PCT/JP2020/041216 (WO), filed on Nov. 4, 2020.
Prior Publication US 2023/0106832 A1, Apr. 6, 2023
Int. Cl. H04S 1/00 (2006.01); G10L 19/008 (2013.01); G10L 19/24 (2013.01); H04S 7/00 (2006.01)
CPC G10L 19/008 (2013.01) [G10L 19/24 (2013.01); H04S 1/007 (2013.01); H04S 7/30 (2013.01); H04S 2400/03 (2013.01)]
6 Claims
 
1. A sound signal downmix method of obtaining a downmix signal that is a monaural sound signal from input sound signals of N channels, N being an integer of three or greater, the sound signal downmix method comprising:
obtaining an inter-channel correlation value and preceding channel information of every pair of two channels included in the N channels, the inter-channel correlation value being a value from 0 to 1 indicating a degree of a correlation between input sound signals of the two channels, the preceding channel information being information indicating which of the input sound signals of the two channels is preceding;
obtaining every sample xM(t) of the downmix signal, wherein
a sample number is denoted as t, every sample of the input sound signal of an ith channel whose i is from 1 to N is denoted as xi(t), and every sample of the downmix signal is denoted as xM(t),
a set of channel numbers of channels preceding the ith channel is denoted as ILi,
a set of channel numbers of channels succeeding the ith channel is denoted as IFi,
the inter-channel correlation value of a pair of the ith channel and every channel j preceding the ith channel is denoted as γij,
the inter-channel correlation value of a pair of the ith channel and every channel k succeeding the ith channel is denoted as γik,
a weight of the ith channel is denoted as wi, wi being expressed by

[Equation defining wi: an image in the source, not reproduced in this text]
a normalized weight of the ith channel is denoted as w′i, w′i being expressed by

[Equation defining w′i: an image in the source, not reproduced in this text]
 and
every sample xM(t) of the downmix signal is obtained by

[Equation defining xM(t): an image in the source, not reproduced in this text]
encoding, based on the downmixed signal in embedded form focused on a monaural embedding, the plurality of sound signals of N channels.
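
The weighting and downmix steps recited in claim 1 can be illustrated with a minimal Python sketch. The claim's three expressions are not reproduced in this text, so the formulas below are assumptions, not the claimed expressions: the weight wi is taken as (1/N) times the product of (1 − γij) over channels j preceding channel i and (1 + γik) over channels k succeeding it, the normalized weight w′i as wi divided by the sum of all weights, and xM(t) as the sum of w′i·xi(t). The names `downmix`, `gamma` and `precedes`, and the fallback to uniform weights when all weights vanish, are hypothetical. The encoding step is not sketched.

```python
from typing import List, Sequence, Tuple


def downmix(
    signals: Sequence[Sequence[float]],
    gamma: Sequence[Sequence[float]],
    precedes: Sequence[Sequence[bool]],
) -> Tuple[List[float], List[float]]:
    """Return (downmix samples, normalized weights) for N >= 3 channels.

    signals[i][t]  : sample t of channel i (all channels the same length).
    gamma[i][j]    : inter-channel correlation value in [0, 1], symmetric.
    precedes[a][b] : True if channel a precedes channel b; any channel not
                     marked as preceding channel i is treated as succeeding
                     it (an assumption of this sketch).
    """
    n = len(signals)
    if n < 3:
        raise ValueError("the claim requires N to be three or greater")

    weights = []
    for i in range(n):
        w = 1.0 / n
        for j in range(n):
            if j == i:
                continue
            if precedes[j][i]:
                # ASSUMED form: a channel preceding i reduces the weight of i.
                w *= 1.0 - gamma[i][j]
            else:
                # ASSUMED form: a channel succeeding i raises the weight of i.
                w *= 1.0 + gamma[i][j]
        weights.append(w)

    total = sum(weights)
    if total == 0.0:
        # Hypothetical safeguard for inconsistent (cyclic) preceding info.
        normalized = [1.0 / n] * n
    else:
        normalized = [w / total for w in weights]

    length = len(signals[0])
    mixed = [
        sum(normalized[i] * signals[i][t] for i in range(n))
        for t in range(length)
    ]
    return mixed, normalized


# Usage: three uncorrelated channels reduce to a plain average.
x = [[3.0, 0.0], [0.0, 3.0], [3.0, 3.0]]
zero = [[0.0] * 3 for _ in range(3)]
order = [[False, True, True], [False, False, True], [False, False, False]]
mono, w_norm = downmix(x, zero, order)
```

Under these assumed formulas, a channel that precedes strongly correlated channels dominates the downmix, while uncorrelated channels are averaged with equal weight.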