US 12,149,914 B2
Multi-channel speech compression system and method
Dushyant Sharma, Mountain House, CA (US); Patrick A. Naylor, Reading (GB); and Uwe Helmut Jost, Groton, MA (US)
Assigned to Microsoft Technology Licensing, LLC, Redmond, WA (US)
Filed by Microsoft Technology Licensing, LLC, Redmond, WA (US)
Filed on Feb. 11, 2022, as Appl. No. 17/669,599.
Claims priority of provisional application 63/148,427, filed on Feb. 11, 2021.
Claims priority of provisional application 63/183,848, filed on May 4, 2021.
Prior Publication US 2022/0254361 A1, Aug. 11, 2022
Int. Cl. G10L 15/22 (2006.01); G06T 7/70 (2017.01); G10L 15/06 (2013.01); G10L 19/008 (2013.01); G10L 19/16 (2013.01); G10L 21/0208 (2013.01); H04R 1/40 (2006.01); H04R 3/00 (2006.01); H04R 5/027 (2006.01); H04S 3/00 (2006.01); H04S 7/00 (2006.01); G10L 19/00 (2013.01); G10L 21/0216 (2013.01)
CPC H04S 7/30 (2013.01) [G06T 7/70 (2017.01); G10L 15/063 (2013.01); G10L 15/22 (2013.01); G10L 19/008 (2013.01); G10L 19/167 (2013.01); G10L 21/0208 (2013.01); H04R 1/406 (2013.01); H04R 3/005 (2013.01); H04R 5/027 (2013.01); H04S 3/008 (2013.01); G10L 2019/0001 (2013.01); G10L 2019/0002 (2013.01); G10L 2021/02166 (2013.01); H04R 2201/401 (2013.01); H04S 2400/01 (2013.01); H04S 2400/15 (2013.01)] 14 Claims
11. A computing system comprising:
a memory; and
a processor configured to obtain machine vision encounter information using one or more machine vision systems,
wherein the processor is further configured to obtain audio encounter information using a plurality of audio acquisition devices of an audio recording system,
wherein the processor is further configured to encode the audio encounter information using one or more codecs,
wherein the processor is further configured to adapt the encoding of the audio encounter information by the one or more codecs based upon, at least in part, the machine vision encounter information,
wherein the processor is further configured to generate a plurality of acoustic relative transfer functions between the plurality of audio acquisition devices of the audio recording system, and
wherein adapting the encoding of the audio encounter information by the one or more codecs based upon, at least in part, the machine vision encounter information includes adapting the one or more codecs to encode the audio encounter information using one or more acoustic relative transfer functions associated with a particular acoustic source when the machine vision encounter information detects the acoustic source.
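
The flow recited in claim 11 can be illustrated with a minimal sketch. It is not the patented implementation: the claim does not say how the relative transfer functions are estimated or how the codec uses them. The sketch assumes a simple per-bin spectral-ratio estimate of the relative transfer function between a reference microphone and a second microphone, and a hypothetical codec mode in which, once machine vision reports a known source, only the reference channel and a source label are encoded. All names (`estimate_rtf`, `adapt_encoding`, `rtf_bank`, the `"rtf"` and `"full"` modes) are illustrative assumptions.

```python
import cmath
import math
from typing import Dict, List, Optional

Spectrum = List[complex]


def dft(x: List[float]) -> Spectrum:
    """Naive discrete Fourier transform (adequate for a short illustrative frame)."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def estimate_rtf(reference: List[float], other: List[float],
                 eps: float = 1e-12) -> Spectrum:
    """Assumed estimator: per-bin ratio X_other[k] / X_ref[k], regularized by eps.

    Computed as X_other * conj(X_ref) / (|X_ref|^2 + eps) to avoid dividing by zero.
    """
    ref_spec = dft(reference)
    other_spec = dft(other)
    return [
        o * r.conjugate() / (abs(r) ** 2 + eps)
        for r, o in zip(ref_spec, other_spec)
    ]


def adapt_encoding(channels: List[List[float]],
                   rtf_bank: Dict[str, Spectrum],
                   detected_source: Optional[str]) -> dict:
    """Hypothetical codec adaptation driven by a machine vision detection.

    If vision reports a source whose RTFs are known, encode only the reference
    channel plus the source label (a decoder could then rebuild the other
    channel from the stored RTF). Otherwise encode every channel.
    """
    if detected_source is not None and detected_source in rtf_bank:
        return {"mode": "rtf", "source": detected_source, "reference": channels[0]}
    return {"mode": "full", "channels": channels}


# Two-microphone toy frame: the second microphone hears the same impulse
# one sample later and at half the amplitude.
mic_ref = [1.0, 0.0, 0.0, 0.0]
mic_other = [0.0, 0.5, 0.0, 0.0]

rtf_bank = {"speaker_a": estimate_rtf(mic_ref, mic_other)}

with_detection = adapt_encoding([mic_ref, mic_other], rtf_bank, "speaker_a")
without_detection = adapt_encoding([mic_ref, mic_other], rtf_bank, None)
```

In this toy frame the estimated relative transfer function has magnitude of about 0.5 in every bin, with a phase slope corresponding to the one-sample delay. A practical system would more likely average cross-spectra over many frames and encode a residual alongside the reference channel; both are outside what the claim text specifies.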