CPC H04N 7/157 (2013.01) [G06F 3/1446 (2013.01); G06T 13/40 (2013.01); G06V 10/40 (2022.01); G06V 40/172 (2022.01); G06V 40/174 (2022.01); G10L 15/02 (2013.01); G10L 17/00 (2013.01); G10L 25/63 (2013.01); H04N 7/141 (2013.01)]
20 Claims
1. A method comprising:
generating, by at least one processor and based on an avatar model, a data stream including at least one image of a face associated with the avatar model, audio data associated with a speech of the avatar model, and a rotation instruction;
transmitting, by the at least one processor, the data stream to a three-dimensional video call system, the three-dimensional video call system including:
a stand;
an axle extending from the stand;
a controller;
at least one acoustic sensor coupled with the controller and configured to sense an ambient acoustic signal in an ambient environment;
a video camera coupled with the controller and configured to capture an ambient video signal in the ambient environment;
at least one actuator coupled with the controller and configured to rotate the axle; and
a plurality of display devices attached to the axle and communicatively coupled with the controller, wherein the controller is configured to:
cause a display device of the plurality of display devices to:
display a portion of the at least one image of the face, thereby causing the plurality of display devices to display a three-dimensional image of the face; and
play back the audio data;
cause the at least one actuator to rotate the axle according to the rotation instruction;
analyze the ambient video signal and the ambient acoustic signal to obtain at least one environmental feature; and
transmit the at least one environmental feature to the at least one processor.
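For illustration only, the sketch below restates the data flow recited in claim 1 in Python. Every name in it (AvatarDataStream, RotationInstruction, controller_step, extract_environmental_features, and the display/actuator/camera/microphone interfaces) is hypothetical and does not appear in the claim or the underlying disclosure; the sketch merely mirrors the claimed steps of displaying portions of the face image, playing back the audio, rotating the axle, and returning environmental features.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class RotationInstruction:
    """Hypothetical rotation instruction for the axle carrying the displays."""
    angle_degrees: float       # target rotation of the axle relative to the stand
    angular_speed_dps: float   # rotation speed, in degrees per second


@dataclass
class AvatarDataStream:
    """Hypothetical unit of the data stream generated from the avatar model."""
    face_image_portions: List[bytes]   # at least one image of the face, split per display device
    speech_audio: bytes                # audio data for the avatar's speech
    rotation: RotationInstruction      # rotation instruction for the actuator


def extract_environmental_features(video_frame: bytes, audio_frame: bytes) -> Dict[str, int]:
    """Placeholder analysis of the ambient video and acoustic signals."""
    return {"video_bytes": len(video_frame), "audio_bytes": len(audio_frame)}


def controller_step(stream: AvatarDataStream, displays, actuator, camera, microphone):
    """Illustrative controller behavior for one received stream unit."""
    # Each display device shows its portion of the face image so that the
    # plurality of display devices together presents a 3-D image of the face.
    for display, portion in zip(displays, stream.face_image_portions):
        display.show(portion)

    # Play back the speech audio through a display device.
    displays[0].play_audio(stream.speech_audio)

    # Rotate the axle according to the rotation instruction.
    actuator.rotate(stream.rotation.angle_degrees, stream.rotation.angular_speed_dps)

    # Analyze the ambient video and acoustic signals and return the
    # environmental features to be transmitted back to the remote processor.
    return extract_environmental_features(camera.capture(), microphone.record())
```

In this reading, each display device mounted on the rotating axle contributes one portion of the face image, and the environmental features extracted from the camera and acoustic sensor flow back to the processor that drives the avatar model.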