US 11,836,952 B2
Enhanced user experience through bi-directional audio and visual signal generation
Sunando Sengupta, Reading (GB); Alexandros Neofytou, London (GB); Eric Chris Wolfgang Sommerlade, Oxford (GB); and Yang Liu, Reading (GB)
Assigned to Microsoft Technology Licensing, LLC, Redmond, WA (US)
Filed by Microsoft Technology Licensing, LLC, Redmond, WA (US)
Filed on Apr. 26, 2021, as Appl. No. 17/240,510.
Prior Publication US 2022/0343543 A1, Oct. 27, 2022
Int. Cl. G06K 9/00 (2022.01); G06T 9/00 (2006.01); G06T 3/60 (2006.01); G10L 19/012 (2013.01); G10L 25/51 (2013.01); G06F 18/21 (2023.01); G10L 19/00 (2013.01)
CPC G06T 9/00 (2013.01) [G06F 18/21 (2023.01); G06T 3/60 (2013.01); G10L 19/012 (2013.01); G10L 25/51 (2013.01); G10L 2019/0002 (2013.01)]
19 Claims
1. A computer-implemented method for creating an output signal based on an input signal of different modality, the method comprising:
receiving the input signal in a first modality;
encoding the input signal;
translating the encoded input signal to an encoded output signal using a trained model, wherein the trained model is trained based on visual signals, augmented visual signals, audio signals and augmented audio signals;
decoding the encoded output signal to create the output signal, the output signal of a second modality which is different than the first modality; and
presenting the output signal in connection with the input signal.
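
The steps recited in claim 1 (receive, encode, translate, decode, present) can be illustrated with a minimal Python sketch. It is not the patented implementation: the `Signal` type, the bucket-averaging encoder, the fixed weight matrix standing in for the trained model, and every function name are hypothetical choices made only for illustration. Training on visual, augmented visual, audio, and augmented audio signals is not shown.

```python
from dataclasses import dataclass
from typing import Dict, List

Vector = List[float]


@dataclass
class Signal:
    """A signal tagged with its modality (hypothetical type, e.g. "audio" or "visual")."""
    modality: str
    samples: Vector


def encode(signal: Signal, dim: int) -> Vector:
    """Toy encoder (assumption): average the samples into `dim` contiguous buckets."""
    n = len(signal.samples)
    if n == 0:
        raise ValueError("cannot encode an empty signal")
    code = []
    for i in range(dim):
        chunk = signal.samples[i * n // dim:(i + 1) * n // dim]
        code.append(sum(chunk) / len(chunk) if chunk else 0.0)
    return code


def translate(code: Vector, weights: List[Vector]) -> Vector:
    """Stand-in for the trained model (assumption): a fixed linear map."""
    return [sum(w * x for w, x in zip(row, code)) for row in weights]


def decode(code: Vector, modality: str, repeat: int = 2) -> Signal:
    """Toy decoder (assumption): repeat each code value `repeat` times."""
    return Signal(modality, [v for v in code for _ in range(repeat)])


def present(inp: Signal, out: Signal) -> Dict[str, Vector]:
    """Pair the output with the input, keyed by modality."""
    return {inp.modality: inp.samples, out.modality: out.samples}


def run_pipeline(inp: Signal, weights: List[Vector], target: str) -> Dict[str, Vector]:
    """Receive -> encode -> translate -> decode -> present."""
    if target == inp.modality:
        raise ValueError("output modality must differ from input modality")
    encoded_in = encode(inp, dim=len(weights[0]))
    encoded_out = translate(encoded_in, weights)
    out = decode(encoded_out, target)
    return present(inp, out)


audio = Signal("audio", [1.0, 1.0, 3.0, 3.0])
weights = [[1.0, 0.0], [0.0, 2.0]]  # placeholder for learned parameters
result = run_pipeline(audio, weights, target="visual")
```

In this sketch the check in `run_pipeline` mirrors the claim's requirement that the second modality differ from the first, and `present` returns both signals together to reflect presenting the output in connection with the input.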