US 12,334,098 B2
Methods, apparatus, and systems for detection and extraction of spatially-identifiable subband audio sources
Aaron Steven Master, San Francisco, CA (US); Lie Lu, Dublin, CA (US); and Harald Mundt, Fürth (DE)
Assigned to Dolby Laboratories Licensing Corporation, San Francisco, CA (US); and DOLBY INTERNATIONAL AB, Dublin (IE)
Appl. No. 18/009,501
Filed by Dolby Laboratories Licensing Corporation, San Francisco, CA (US); and DOLBY INTERNATIONAL AB, Dublin (IE)
PCT Filed Jun. 11, 2021, PCT No. PCT/US2021/036900
§ 371(c)(1), (2) Date Dec. 9, 2022,
PCT Pub. No. WO2021/252823, PCT Pub. Date Dec. 16, 2021.
Claims priority of provisional application 63/038,048, filed on Jun. 11, 2020.
Claims priority of application No. 20179447 (EP), filed on Jun. 11, 2020.
Prior Publication US 2023/0245671 A1, Aug. 3, 2023
Int. Cl. G10L 21/0272 (2013.01)
CPC G10L 21/0272 (2013.01)
19 Claims
 
1. A method comprising:
transforming, using one or more processors, one or more frames of a two-channel time domain audio signal into a time-frequency domain representation including a plurality of time-frequency tiles, wherein the frequency domain of the time-frequency domain representation includes a plurality of frequency bins grouped into a plurality of subbands;
for each time-frequency tile:
calculating, using the one or more processors, spatial parameters and a level for the time-frequency tile;
modifying, using the one or more processors, the spatial parameters using shift and squeeze parameters;
obtaining, using the one or more processors, a softmask value for each frequency bin using the modified spatial parameters, the level and subband information; and
applying, using the one or more processors, the softmask values to the time-frequency tile to generate a modified time-frequency tile of an estimated audio source.
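
The steps recited in claim 1 can be illustrated with a minimal NumPy sketch. It is not the patented implementation. The claim does not define the spatial parameters, the form of the shift and squeeze operation, or how the softmask is obtained, so everything below is an assumption made for illustration: a panning angle `theta` and inter-channel phase difference `phi` as the spatial parameters, a level in dB, a subtract-then-scale shift and squeeze, a Gaussian-shaped mask, and a hypothetical per-subband level floor `floor_db` standing in for the "subband information". All function and parameter names are hypothetical.

```python
import numpy as np

def bin_to_subband(n_bins, band_edges):
    """Map each frequency bin index to the subband that contains it.
    band_edges holds n_subbands + 1 ascending bin indices (assumed layout)."""
    return np.searchsorted(band_edges, np.arange(n_bins), side="right") - 1

def spatial_params_and_level(X, eps=1e-12):
    """Per-bin panning angle, inter-channel phase difference, and level (dB).
    These definitions are assumptions; the claim only says 'spatial parameters'."""
    left, right = X[0], X[1]
    theta = np.arctan2(np.abs(right), np.abs(left))  # 0 = left only, pi/2 = right only
    phi = np.angle(left * np.conj(right))            # wrapped to (-pi, pi]
    level_db = 10.0 * np.log10(np.abs(left) ** 2 + np.abs(right) ** 2 + eps)
    return theta, phi, level_db

def shift_and_squeeze(theta, phi, theta_shift, phi_shift, squeeze):
    """Assumed form: center the target source at zero, then scale the spread."""
    theta_mod = squeeze * (theta - theta_shift)
    phi_mod = squeeze * np.angle(np.exp(1j * (phi - phi_shift)))  # re-wrap, then scale
    return theta_mod, phi_mod

def softmask(theta_mod, phi_mod, level_db, subband, floor_db):
    """Hypothetical mask in [0, 1]: Gaussian in the modified spatial parameters,
    gated by a sigmoid on level relative to a per-subband floor."""
    spatial = np.exp(-0.5 * (theta_mod ** 2 + phi_mod ** 2))
    gate = 1.0 / (1.0 + np.exp(-(level_db - floor_db[subband]) / 3.0))
    return spatial * gate

def extract_tile(frame_lr, band_edges, floor_db, theta_shift,
                 phi_shift=0.0, squeeze=4.0):
    """Process one two-channel frame of shape (2, N); return the masked
    spectrum of the estimated source and the per-bin softmask."""
    window = np.hanning(frame_lr.shape[-1])
    X = np.fft.rfft(frame_lr * window, axis=-1)      # shape (2, N // 2 + 1)
    subband = bin_to_subband(X.shape[-1], band_edges)
    theta, phi, level_db = spatial_params_and_level(X)
    theta_mod, phi_mod = shift_and_squeeze(theta, phi, theta_shift,
                                           phi_shift, squeeze)
    mask = softmask(theta_mod, phi_mod, level_db, subband, floor_db)
    return mask * X, mask

# Usage: a center-panned 1 kHz tone plus a left-only 3 kHz tone.
fs, n = 16000, 1024
t = np.arange(n) / fs
center = np.sin(2 * np.pi * 1000 * t)
left_only = np.sin(2 * np.pi * 3000 * t)
frame = np.stack([center + left_only, center])       # rows: left, right
Y, mask = extract_tile(frame, band_edges=np.array([0, 32, 128, 513]),
                       floor_db=np.zeros(3), theta_shift=np.pi / 4)
```

In this sketch, setting `theta_shift` to pi/4 targets a center-panned source, so the mask stays near 1 at the 1 kHz bin and falls toward 0 at the left-only 3 kHz bin. A lookup table indexed by quantized parameters could replace the closed-form mask; that choice is likewise an assumption rather than something the claim states.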