US 12,367,890 B2
Audio source separation and audio dubbing
Stefan Uhlich, Stuttgart (DE); Giorgio Fabbro, Stuttgart (DE); Marc Ferras Font, Stuttgart (DE); Falk-Martin Hoffmann, Stuttgart (DE); and Thomas Kemp, Stuttgart (DE)
Assigned to Sony Group Corporation, Tokyo (JP)
Appl. No. 17/925,025
Filed by Sony Group Corporation, Tokyo (JP)
PCT Filed Mar. 17, 2021, PCT No. PCT/EP2021/056828
§ 371(c)(1), (2) Date Nov. 14, 2022,
PCT Pub. No. WO2021/239285, PCT Pub. Date Dec. 2, 2021.
Claims priority of application No. 20177598 (EP), filed on May 29, 2020.
Prior Publication US 2023/0186937 A1, Jun. 15, 2023
Int. Cl. G10L 21/028 (2013.01); G10H 1/36 (2006.01); G10L 13/04 (2013.01); G10L 15/08 (2006.01); G10L 15/22 (2006.01); G10L 25/18 (2013.01); H04S 3/00 (2006.01)
CPC G10L 21/028 (2013.01) [G10H 1/361 (2013.01); G10L 13/04 (2013.01); G10L 15/08 (2013.01); G10L 15/22 (2013.01); G10L 25/18 (2013.01); G10H 2210/005 (2013.01); G10L 2015/088 (2013.01); H04S 3/008 (2013.01); H04S 2400/01 (2013.01)]
14 Claims
 
1. An electronic device, comprising:
circuitry configured to
perform audio source separation on an audio input signal to obtain a separated source and a residual signal, wherein the residual signal includes audio content from the audio input signal other than the separated source,
delay the separated source by a first delay amount to obtain a delayed separated source, wherein the first delay amount corresponds to a processing latency of phrase detection,
delay the residual signal by a second delay amount to obtain a delayed residual signal, wherein the second delay amount corresponds to a combined processing latency of the phrase detection and audio dubbing,
perform audio dubbing on the separated source based on replacement conditions to obtain a personalized separated source, and
mix the personalized separated source with the delayed residual signal to obtain a personalized audio signal.
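
The delay alignment recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation: the names `delay`, `personalize`, `toy_separate` and `make_toy_dub` are hypothetical, latencies are expressed in samples, the separation and dubbing stages are trivial stand-ins, and the sketch assumes that the dubbing stage consumes the delayed separated source and itself adds the dubbing latency. Under those assumptions, both paths reach the mixer with the same total delay.

```python
from typing import Callable, List, Tuple

Signal = List[float]


def delay(signal: Signal, n: int) -> Signal:
    """Delay a signal by n samples: zero-pad the start, keep the original length."""
    if n < 0:
        raise ValueError("delay must be non-negative")
    return ([0.0] * n + list(signal))[: len(signal)]


def personalize(
    mixture: Signal,
    separate: Callable[[Signal], Tuple[Signal, Signal]],
    dub: Callable[[Signal], Signal],
    phrase_latency: int,
    dubbing_latency: int,
) -> Signal:
    """Hypothetical pipeline following the order of steps in claim 1."""
    # Source separation yields the separated source and the residual.
    source, residual = separate(mixture)
    # First delay: processing latency of phrase detection.
    delayed_source = delay(source, phrase_latency)
    # Second delay: combined latency of phrase detection and dubbing.
    delayed_residual = delay(residual, phrase_latency + dubbing_latency)
    # Assumption: dub() adds dubbing_latency samples of delay on its own.
    personalized = dub(delayed_source)
    # Mix the two time-aligned paths sample by sample.
    return [p + r for p, r in zip(personalized, delayed_residual)]


def toy_separate(mixture: Signal) -> Tuple[Signal, Signal]:
    """Stand-in for a real separator: half of each sample goes to each output."""
    half = [0.5 * x for x in mixture]
    return half, list(half)


def make_toy_dub(latency: int, gain: float) -> Callable[[Signal], Signal]:
    """Stand-in for dubbing: applies a gain and introduces `latency` samples."""
    def dub(source: Signal) -> Signal:
        return delay([gain * x for x in source], latency)
    return dub


# An impulse at sample 0 should appear in both paths at sample 1 + 2 = 3.
impulse = [1.0, 0.0, 0.0, 0.0, 0.0, 0.0]
result = personalize(impulse, toy_separate, make_toy_dub(2, 2.0), 1, 2)
print(result)  # -> [0.0, 0.0, 0.0, 1.5, 0.0, 0.0]
```

If the residual were delayed by the phrase detection latency alone, its impulse would land at sample 1 while the dubbed source lands at sample 3, which is the misalignment the second delay amount is meant to prevent.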