| CPC G10L 15/1815 (2013.01) [G10L 15/063 (2013.01); G10L 15/16 (2013.01); G10L 15/26 (2013.01); G10L 15/30 (2013.01)] | 20 Claims |

|
14. A server for generating textual representations of a user utterance, the user utterance being collected by an electronic device, communicatively coupled with the server, associated with a user, the user utterance being provided in response to machine-generated utterances outputted by the electronic device, the server comprising:
a processor;
a non-transitory computer-readable medium storing instructions; and
the processor, upon executing the instructions, being configured to:
acquire, from the electronic device, an audio signal being an audio representation of the user utterance,
the user utterance being in response to a given machine-generated utterance previously outputted by the electronic device to the user;
acquire a machine-generated text string being a textual representation of the given machine-generated utterance; and
generate, using a Speech-to-Text (STT) model, another text string based on the audio signal and the machine-generated text string,
the another text string being a textual representation of the user utterance while taking into account the machine-generated text string as a context of the user utterance.
|
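The data flow recited in claim 14 can be illustrated with a minimal sketch. It is not the patented implementation: the claim does not say how the STT model uses the context, so this example swaps in a toy n-best rescoring step that favors hypotheses sharing words with the machine-generated text string. All names here (`Hypothesis`, `ToyContextualSTT`, `handle_user_utterance`, `context_weight`) and the score values are hypothetical, and the audio signal is a placeholder that the toy model never decodes.

```python
import re
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Hypothesis:
    """One candidate transcript with a hypothetical acoustic score (higher is better)."""
    text: str
    acoustic_score: float


def _words(text: str) -> list[str]:
    # Lowercase and strip punctuation so "one?" matches "one".
    return re.findall(r"[a-z']+", text.lower())


class ToyContextualSTT:
    """Toy stand-in for an STT model (assumption, not the claimed model).

    It holds a fixed n-best list instead of decoding audio, and rescores
    each hypothesis by its word overlap with the supplied context.
    """

    def __init__(self, nbest: Sequence[Hypothesis], context_weight: float = 0.5):
        self.nbest = list(nbest)
        self.context_weight = context_weight

    def transcribe(self, audio: bytes, context: str) -> str:
        context_words = set(_words(context))

        def score(h: Hypothesis) -> float:
            words = _words(h.text)
            # Fraction of hypothesis words that also occur in the context.
            overlap = sum(w in context_words for w in words) / max(len(words), 1)
            return h.acoustic_score + self.context_weight * overlap

        return max(self.nbest, key=score).text


def handle_user_utterance(audio: bytes, machine_text: str, model: ToyContextualSTT) -> str:
    """Server-side step: audio signal plus machine-generated text string in, text out."""
    return model.transcribe(audio, context=machine_text)


if __name__ == "__main__":
    nbest = [
        Hypothesis("I want the read one", 0.52),  # acoustically slightly preferred
        Hypothesis("I want the red one", 0.50),
    ]
    model = ToyContextualSTT(nbest)
    prompt = "Would you like the red or the blue one?"
    print(handle_user_utterance(b"", prompt, model))
    print(handle_user_utterance(b"", "", model))
```

With the prompt as context the sketch selects "I want the red one"; with an empty context it falls back to the acoustically preferred "I want the read one". That contrast is the effect the final limitation of the claim describes, although a real system would more plausibly feed the context to the model itself than rescore a fixed list.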