US 12,260,856 B2
Method and system for recognizing a user utterance
Vasily Alekseevich Ershov, Sankt-Peterburg (RU); and Igor Evgenevich Kuralenok, Sankt-Peterburg (RU)
Assigned to Y.E. Hub Armenia LLC, Yerevan (AM)
Filed by YANDEX EUROPE AG, Lucerne (CH)
Filed on Dec. 14, 2022, as Appl. No. 18/081,634.
Claims priority of application No. RU2021138538 (RU), filed on Dec. 23, 2021.
Prior Publication US 2023/0206910 A1, Jun. 29, 2023
Int. Cl. G10L 15/30 (2013.01); G10L 15/06 (2013.01); G10L 15/16 (2006.01); G10L 15/18 (2013.01); G10L 15/26 (2006.01)
CPC G10L 15/1815 (2013.01) [G10L 15/063 (2013.01); G10L 15/16 (2013.01); G10L 15/26 (2013.01); G10L 15/30 (2013.01)]
20 Claims
 
14. A server for generating textual representations of a user utterance, the user utterance being collected by an electronic device, communicatively coupled with the server, associated with a user, the user utterance being provided in response to machine-generated utterances outputted by the electronic device, the server comprising:
a processor;
a non-transitory computer-readable medium storing instructions; and
the processor, upon executing the instructions, being configured to:
acquire, from the electronic device, an audio signal being an audio representation of the user utterance,
the user utterance being in response to a given machine-generated utterance previously outputted by the electronic device to the user;
acquire a machine-generated text string being a textual representation of the given machine-generated utterance; and
generate, using a Speech-to-Text (STT) model, another text string based on the audio signal and the machine-generated text string,
the another text string being a textual representation of the user utterance while taking into account the machine-generated text string as a context of the user utterance.
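
The data flow recited in claim 14 can be illustrated with a minimal Python sketch. It is not the patented implementation: the claim does not say how the STT model uses the machine-generated text string, so the word-overlap rescoring below is an assumption chosen only to make the effect of context visible. All names (`ContextualSTTModel`, `ToyRescoringModel`, `recognize_utterance`, `context_weight`) are hypothetical, and the toy model ignores the audio bytes and picks among fixed hypotheses instead of decoding speech.

```python
import re
from dataclasses import dataclass
from typing import Protocol, Sequence, Set, Tuple


def _words(text: str) -> Set[str]:
    """Lowercase word set, punctuation stripped."""
    return set(re.findall(r"[a-z']+", text.lower()))


class ContextualSTTModel(Protocol):
    """Hypothetical interface: an STT model that accepts context text."""

    def transcribe(self, audio: bytes, context: str) -> str: ...


@dataclass
class ToyRescoringModel:
    """Stand-in for a real STT model (assumption, not from the claim).

    Holds fixed (text, acoustic_score) hypotheses and adds a bonus for
    each word a hypothesis shares with the context string.
    """

    hypotheses: Sequence[Tuple[str, float]]
    context_weight: float = 1.0

    def transcribe(self, audio: bytes, context: str) -> str:
        # The audio is unused here; a real model would decode it.
        context_words = _words(context)

        def score(item: Tuple[str, float]) -> float:
            text, acoustic = item
            overlap = len(context_words & _words(text))
            return acoustic + self.context_weight * overlap

        return max(self.hypotheses, key=score)[0]


def recognize_utterance(
    audio: bytes, machine_text: str, model: ContextualSTTModel
) -> str:
    """Mirror the claimed steps: take the audio signal and the
    machine-generated text string, and return another text string
    produced with the machine text as context."""
    return model.transcribe(audio, machine_text)


# Two acoustically similar hypotheses; "flower" scores slightly higher.
model = ToyRescoringModel(hypotheses=[("flower", 0.55), ("flour", 0.45)])
with_context = recognize_utterance(b"", "Do you want flour or sugar?", model)
without_context = recognize_utterance(b"", "", model)
```

In this sketch the prompt "Do you want flour or sugar?" tips the choice toward "flour", whereas with an empty context the acoustically stronger "flower" is selected. A production system would more plausibly feed the context into the model itself rather than rescore a fixed list.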