US 11,929,076 B2
User-perceived latency while maintaining accuracy
Hosam Adel Khalil, Issaquah, WA (US); Emilian Stoimenov, Bellevue, WA (US); Christopher Hakan Basoglu, Everett, WA (US); Kshitiz Kumar, Redmond, WA (US); and Jian Wu, Bellevue, WA (US)
Assigned to Microsoft Technology Licensing, LLC, Redmond, WA (US)
Filed by Microsoft Technology Licensing, LLC, Redmond, WA (US)
Filed on Dec. 1, 2022, as Appl. No. 18/060,949.
Application 18/060,949 is a continuation of application No. 17/123,087, filed on Dec. 15, 2020, granted, now 11,532,312.
Prior Publication US 2023/0102295 A1, Mar. 30, 2023
Int. Cl. G10L 15/32 (2013.01); G10L 15/16 (2006.01); G10L 15/30 (2013.01); G10L 19/16 (2013.01); G10L 25/51 (2013.01); G10L 15/08 (2006.01)
CPC G10L 15/32 (2013.01) [G10L 15/16 (2013.01); G10L 15/30 (2013.01); G10L 19/167 (2013.01); G10L 25/51 (2013.01); G10L 2015/088 (2013.01)]
20 Claims
 
1. A computer-implemented method comprising:
receiving an audio stream, in parallel, by a primary speech recognition engine (SRE) and a secondary SRE;
generating a concatenated output including an encoded output of the secondary SRE and an early-stage encoded output of the primary SRE;
processing the concatenated output by the primary SRE; and
based upon the processing of the concatenated output by the primary SRE, generating a speech recognition result.
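
The data flow recited in claim 1 can be illustrated with a minimal sketch. This is not the patented implementation: the function names (secondary_encode, primary_early_encode, concatenate, primary_late_stages, recognize), the toy per-frame features, and the thresholding "decoder" are all hypothetical stand-ins chosen so the example runs with the standard library alone. It assumes the audio stream has already been split into non-empty frames of samples and that the concatenation is done frame by frame.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List

Frame = List[float]  # one non-empty frame of audio samples (assumed framing)


def secondary_encode(frames: List[Frame]) -> List[List[float]]:
    """Hypothetical secondary SRE encoder: one feature per frame (the mean)."""
    return [[sum(f) / len(f)] for f in frames]


def primary_early_encode(frames: List[Frame]) -> List[List[float]]:
    """Hypothetical early stage of the primary SRE: two features per frame."""
    return [[max(f), min(f)] for f in frames]


def concatenate(secondary: List[List[float]],
                primary_early: List[List[float]]) -> List[List[float]]:
    """Join the two encoded outputs frame by frame into one feature vector."""
    return [s + p for s, p in zip(secondary, primary_early)]


def primary_late_stages(concatenated: List[List[float]]) -> str:
    """Toy stand-in for the remaining primary SRE stages.

    Emits "1" for a frame whose concatenated features sum to a positive
    value and "0" otherwise; a real engine would emit text.
    """
    return "".join("1" if sum(v) > 0 else "0" for v in concatenated)


def recognize(frames: List[Frame]) -> str:
    # Both engines receive the same audio stream in parallel.
    with ThreadPoolExecutor(max_workers=2) as pool:
        secondary_future = pool.submit(secondary_encode, frames)
        primary_future = pool.submit(primary_early_encode, frames)
        # Concatenated output: secondary encoding plus early-stage
        # primary encoding.
        concatenated = concatenate(secondary_future.result(),
                                   primary_future.result())
    # The primary SRE processes the concatenated output and yields the result.
    return primary_late_stages(concatenated)


if __name__ == "__main__":
    print(recognize([[1.0, 2.0], [-1.0, -3.0]]))
```

In this sketch the secondary engine's encoded output feeds the primary engine partway through its pipeline rather than producing a separate competing result, which is the structural point of the claim; the thread pool only models the "in parallel" reception and says nothing about how a production system would schedule the two engines.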