US 12,334,098 B2
Methods, apparatus, and systems for detection and extraction of spatially-identifiable subband audio sources
Aaron Steven Master, San Francisco, CA (US); Lie Lu, Dublin, CA (US); and Harald Mundt, Fürth (DE)
Assigned to Dolby Laboratories Licensing Corporation, San Francisco, CA (US); and DOLBY INTERNATIONAL AB, Dublin (IE)
Appl. No. 18/009,501
Filed by Dolby Laboratories Licensing Corporation, San Francisco, CA (US); and DOLBY INTERNATIONAL AB, Dublin (IE)
PCT Filed Jun. 11, 2021, PCT No. PCT/US2021/036900 § 371(c)(1), (2) Date Dec. 9, 2022, PCT Pub. No. WO2021/252823, PCT Pub. Date Dec. 16, 2021.
Claims priority of provisional application 63/038,048, filed on Jun. 11, 2020.
Claims priority of application No. 20179447 (EP), filed on Jun. 11, 2020.
Prior Publication US 2023/0245671 A1, Aug. 3, 2023
Int. Cl. G10L 21/0272 (2013.01)

CPC G10L 21/0272 (2013.01)

19 Claims

1. A method comprising:

transforming, using one or more processors, one or more frames of a two-channel time domain audio signal into a time-frequency domain representation including a plurality of time-frequency tiles, wherein the frequency domain of the time-frequency domain representation includes a plurality of frequency bins grouped into a plurality of subbands;

for each time-frequency tile:

calculating, using the one or more processors, spatial parameters and a level for the time-frequency tile;

modifying, using the one or more processors, the spatial parameters using shift and squeeze parameters;

obtaining, using the one or more processors, a softmask value for each frequency bin using the modified spatial parameters, the level and subband information; and

applying, using the one or more processors, the softmask values to the time-frequency tile to generate a modified time-frequency tile of an estimated audio source.
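
The steps recited in claim 1 can be illustrated with a minimal NumPy sketch. This is not the patented implementation, and the claim does not specify any of the formulas used here. The following are assumptions made only for illustration: one windowed frame is treated as one tile; the spatial parameters are a panning value in [0, 1] and an interchannel phase difference; "shift and squeeze" is modeled as `(param - shift) * squeeze`; and the softmask is a hypothetical Gaussian of the modified spatial parameters, with a width chosen per subband, multiplied by a sigmoid level gate. All function and parameter names are hypothetical.

```python
import numpy as np

def to_tile(frame, window):
    """Transform one two-channel frame (2, N) into complex bins (2, N//2 + 1)."""
    return np.fft.rfft(frame * window, axis=-1)

def spatial_params_and_level(tile, eps=1e-12):
    """Per-bin panning (0 = left, 1 = right), phase difference, and level in dB."""
    left, right = tile
    mag_l, mag_r = np.abs(left), np.abs(right)
    pan = np.arctan2(mag_r, mag_l + eps) / (np.pi / 2)
    ipd = np.angle(left * np.conj(right))
    level_db = 10.0 * np.log10(mag_l**2 + mag_r**2 + eps)
    return pan, ipd, level_db

def shift_and_squeeze(param, shift, squeeze):
    """Assumed form: recenter on the target value, then rescale."""
    return (param - shift) * squeeze

def softmask(pan_mod, ipd_mod, level_db, bin_subband, pan_width, ipd_width,
             floor_db=-60.0, slope_db=3.0):
    """Hypothetical mask: Gaussian in the spatial parameters times a level gate."""
    spatial = np.exp(-0.5 * ((pan_mod / pan_width[bin_subband]) ** 2
                             + (ipd_mod / ipd_width[bin_subband]) ** 2))
    gate = 1.0 / (1.0 + np.exp(-(level_db - floor_db) / slope_db))
    return spatial * gate

def extract_tile(frame, window, subband_edges, pan_width, ipd_width,
                 pan_shift=0.5, pan_squeeze=1.0):
    """Run the claimed steps for one tile; returns (estimated tile, mask)."""
    tile = to_tile(frame, window)
    pan, ipd, level_db = spatial_params_and_level(tile)
    pan_mod = shift_and_squeeze(pan, pan_shift, pan_squeeze)
    ipd_mod = shift_and_squeeze(ipd, 0.0, 1.0)
    # Map each frequency bin to the index of the subband that contains it.
    bin_subband = np.digitize(np.arange(tile.shape[-1]), subband_edges[1:-1])
    mask = softmask(pan_mod, ipd_mod, level_db, bin_subband, pan_width, ipd_width)
    return mask * tile, mask

# Toy input: a center-panned tone at bin 50 and a left-only tone at bin 200.
n = 1024
t = np.arange(n)
center = np.cos(2 * np.pi * 50 * t / n)
left_only = np.cos(2 * np.pi * 200 * t / n)
frame = np.stack([center + left_only, center])

subband_edges = np.array([0, 64, 256, 513])   # three illustrative subbands
pan_width = np.array([0.05, 0.1, 0.2])        # per-subband mask widths (made up)
ipd_width = np.array([0.5, 1.0, 2.0])

estimate, mask = extract_tile(frame, np.hanning(n), subband_edges,
                              pan_width, ipd_width)
```

With the default `pan_shift=0.5`, the sketch targets a center-panned source, so the mask stays close to one at the center-panned tone and close to zero at the left-only tone. A different target position would be selected by changing `pan_shift`, and a narrower or wider acceptance region by changing `pan_squeeze` or the per-subband widths.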