US 12,475,872 B2
Audio processing method, audio processing system, and computer-readable medium
Yoshifumi Mizuno, Hamamatsu (JP); Yu Takahashi, Hamamatsu (JP); Kazunobu Kondo, Toyohashi (JP); and Kenji Ishizuka, Hamamatsu (JP)
Assigned to YAMAHA CORPORATION, Hamamatsu (JP)
Filed by YAMAHA CORPORATION, Hamamatsu (JP)
Filed on Mar. 24, 2022, as Appl. No. 17/703,697.
Application 17/703,697 is a continuation of application No. PCT/JP2020/035723, filed on Sep. 23, 2020.
Claims priority of application No. 2019-177965 (JP), filed on Sep. 27, 2019; application No. 2019-177966 (JP), filed on Sep. 27, 2019; and application No. 2019-177967 (JP), filed on Sep. 27, 2019.
Prior Publication US 2022/0215822 A1, Jul. 7, 2022
Int. Cl. G10H 3/12 (2006.01); G10H 1/00 (2006.01)

CPC G10H 3/125 (2013.01) [G10H 1/0008 (2013.01); G10H 2210/056 (2013.01); G10H 2210/066 (2013.01); G10H 2250/025 (2013.01)]

20 Claims

1. A computer-implemented audio processing method comprising:

obtaining a plurality of observed envelopes of picked-up sound signals including a first observed envelope representing a contour of a first sound signal picked up in a vicinity of a first sound source and a second observed envelope representing a contour of a second sound signal picked up in a vicinity of a second sound source, wherein:

the first sound signal includes a first target sound from the first sound source and a second spill sound from the second sound source, and

the second sound signal includes a second target sound from the second sound source and a first spill sound from the first sound source;

generating, based on the plurality of observed envelopes, a plurality of output envelopes using a mix matrix including a mix proportion of the second spill sound in the first sound signal and a mix proportion of the first spill sound in the second sound signal, wherein the generated plurality of output envelopes include:

a first output envelope representing a contour of the first target sound in the first observed envelope; and

a second output envelope representing a contour of the second target sound in the second observed envelope; and

displaying on a display device an analysis image representing a level of the second spill sound in the first observed envelope based on the mix matrix and the plurality of output envelopes.
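
For illustration only, the generating step of claim 1 can be sketched for the two-source case. The sketch assumes a linear mixing model that the claim itself does not state: each observed envelope is its own target envelope plus the other source's target envelope scaled by a mix proportion, with the diagonal of the mix matrix fixed to 1 and the mix proportions already known. The function names `separate_envelopes` and `spill_level_in_first` and their parameters are hypothetical. The direct 2x2 inversion with clipping at zero is a stand-in chosen for brevity, not the estimation procedure of the patent. The displaying step is omitted.

```python
def separate_envelopes(observed_1, observed_2, spill_2_in_1, spill_1_in_2):
    """Estimate the two target-sound envelopes from two observed envelopes.

    Assumed model (not taken from the claim text), per time frame t:
        observed_1[t] = target_1[t] + spill_2_in_1 * target_2[t]
        observed_2[t] = spill_1_in_2 * target_1[t] + target_2[t]
    i.e. observed = M @ target with M = [[1, spill_2_in_1], [spill_1_in_2, 1]].
    """
    if len(observed_1) != len(observed_2):
        raise ValueError("observed envelopes must have the same length")

    # Determinant of the assumed 2x2 mix matrix.
    det = 1.0 - spill_2_in_1 * spill_1_in_2
    if det <= 0.0:
        raise ValueError("mix proportions must satisfy spill_2_in_1 * spill_1_in_2 < 1")

    output_1, output_2 = [], []
    for x1, x2 in zip(observed_1, observed_2):
        # Apply the inverse of the mix matrix to one frame.
        y1 = (x1 - spill_2_in_1 * x2) / det
        y2 = (x2 - spill_1_in_2 * x1) / det
        # Envelopes are levels, so negative estimates are clipped to zero.
        output_1.append(max(y1, 0.0))
        output_2.append(max(y2, 0.0))
    return output_1, output_2


def spill_level_in_first(output_2, spill_2_in_1):
    """Level of the second source's spill inside the first observed envelope."""
    return [spill_2_in_1 * y2 for y2 in output_2]


# Usage with made-up values: build observations from known targets, then recover them.
target_1 = [1.0, 0.0, 0.5]
target_2 = [0.0, 2.0, 1.0]
a12, a21 = 0.2, 0.1  # hypothetical mix proportions
obs_1 = [t1 + a12 * t2 for t1, t2 in zip(target_1, target_2)]
obs_2 = [a21 * t1 + t2 for t1, t2 in zip(target_1, target_2)]
out_1, out_2 = separate_envelopes(obs_1, obs_2, a12, a21)
spill = spill_level_in_first(out_2, a12)
```

Under this assumed model, the values in `spill` are the quantity an analysis image would plot against the first observed envelope, since both follow from the mix matrix and the output envelopes.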