| CPC G10L 19/008 (2013.01) [G10L 21/043 (2013.01)] | 14 Claims |

|
1. A sound signal purification method for obtaining a purified decoded sound signal representing a sound signal of a channel of stereo,
the sound signal purification method comprising:
a decoded sound common signal estimation step of obtaining, for each frame, a decoded sound common signal YM that is a signal common to all channels of the stereo by using at least all of one or more and N or less n-th channel decoded sound signals Xn, wherein n represents each integer of 1 or more and N or less, the n-th channel decoded sound signal Xn represents a decoded sound signal of the each channel of the stereo, the n-th channel decoded sound signal Xn is obtained by decoding a stereo code CS, a monaural decoded sound signal XM represents a monaural decoded sound signal, the monaural decoded sound signal XM is obtained by decoding a monaural code CM that is a code different from the stereo code CS, and the n-th channel decoded sound signal Xn is obtained by decoding the stereo code CS without using either information obtained by decoding the monaural code CM or the monaural code CM;
a decoded sound common signal upmixing step of obtaining, for the each frame, an n-th channel upmixed common signal YMn, wherein the n-th channel upmixed common signal YMn is obtained by upmixing the decoded sound common signal YM for the each channel by an upmixing process using the decoded sound common signal YM and inter-channel relationship information, and the inter-channel relationship information indicates a relationship between the channels of the stereo;
a monaural decoded sound upmixing step of obtaining, for the each frame, an n-th channel upmixed monaural decoded sound signal XMn, wherein the n-th channel upmixed monaural decoded sound signal XMn is obtained by upmixing the monaural decoded sound signal XM for the each channel by an upmixing process using the monaural decoded sound signal XM and the inter-channel relationship information;
an n-th channel signal purification step of obtaining, for the each frame and for each corresponding sample t with respect to the each channel n, a sequence based on a value ˜yMn(t)=(1−αMn)×yMn(t)+αMn×xMn(t) as an n-th channel purified upmixed signal ˜YMn, wherein the value ˜yMn(t) is obtained by adding a value αMn×xMn(t) and a value (1−αMn)×yMn(t), the value αMn×xMn(t) is obtained by multiplying an n-th channel purification weight αMn by a sample value xMn(t) of the n-th channel upmixed monaural decoded sound signal XMn, the value (1−αMn)×yMn(t) is obtained by multiplying a value (1−αMn) by a sample value yMn(t) of the n-th channel upmixed common signal YMn, and the value (1−αMn) is obtained by subtracting the n-th channel purification weight αMn from 1;
an n-th channel separation combination weight estimation step of obtaining, for the each frame with respect to the each channel n, a normalized inner product value for the n-th channel upmixed common signal YMn of the n-th channel decoded sound signal Xn as an n-th channel separation combination weight βn; and
an n-th channel separation combination step of obtaining, for the each frame and for each corresponding sample t with respect to the each channel n, a sequence based on a value ˜xn(t)=xn(t)−βn×yMn(t)+βnטyMn(t) as an n-th channel purified decoded sound signal ˜Xn, wherein the value ˜xn(t) is obtained by subtracting a value βn×yMn(t) from a sample value xn(t) of the n-th channel decoded sound signal Xn and adding a value βnטyMn(t), the value βn×yMn(t) is obtained by multiplying the n-th channel separation combination weight βn by the sample value yMn(t) of the n-th channel upmixed common signal YMn, and the value βnטyMn(t) is obtained by multiplying the n-th channel separation combination weight βn by a sample value ˜yMn(t) of the n-th channel purified upmixed signal ˜YMn.
|
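
The last three steps of claim 1 (signal purification, separation combination weight estimation, and separation combination) can be illustrated for one frame and one channel n with the minimal Python sketch below. It is not taken from the patent. The function names `normalized_inner_product` and `purify_channel` are hypothetical, and the normalization used for βn (the inner product of Xn and YMn divided by the energy of YMn) is an assumption, since the claim only calls it a "normalized inner product value". The upmixed signals YMn and XMn and the purification weight αMn are treated as given inputs, because the claim does not say how αMn is computed.

```python
def normalized_inner_product(x, y):
    """Assumed normalization: sum(x*y) / sum(y*y); 0.0 if y has no energy."""
    energy = sum(v * v for v in y)
    if energy == 0.0:
        return 0.0
    return sum(a * b for a, b in zip(x, y)) / energy


def purify_channel(x_n, y_mn, x_mn, alpha_mn):
    """One frame of one channel n, given equal-length sample lists.

    x_n      : n-th channel decoded sound signal Xn
    y_mn     : n-th channel upmixed common signal YMn
    x_mn     : n-th channel upmixed monaural decoded sound signal XMn
    alpha_mn : n-th channel purification weight (taken as given here)
    """
    # Signal purification: ~yMn(t) = (1 - aMn) * yMn(t) + aMn * xMn(t)
    y_tilde = [(1.0 - alpha_mn) * y + alpha_mn * xm
               for y, xm in zip(y_mn, x_mn)]

    # Separation combination weight: normalized inner product of Xn and YMn
    beta_n = normalized_inner_product(x_n, y_mn)

    # Separation combination: ~xn(t) = xn(t) - bn * yMn(t) + bn * ~yMn(t)
    x_tilde = [x - beta_n * y + beta_n * yt
               for x, y, yt in zip(x_n, y_mn, y_tilde)]
    return x_tilde, beta_n


if __name__ == "__main__":
    x_tilde, beta_n = purify_channel(
        x_n=[1.0, 2.0, 3.0],
        y_mn=[1.0, 2.0, 3.0],
        x_mn=[2.0, 4.0, 6.0],
        alpha_mn=0.5,
    )
    print(beta_n, x_tilde)
```

With αMn = 0 the purified upmixed signal equals YMn, so the two βn terms cancel and the output equals Xn. Larger values of αMn replace more of the common component of Xn with the upmixed monaural decoded signal.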