CPC G10L 15/1815 (2013.01) [G06F 3/011 (2013.01); G06F 16/685 (2019.01); G06N 20/00 (2019.01); G10L 21/10 (2013.01); G10L 25/63 (2013.01)] | 27 Claims
14. A device comprising:
one or more processors;
a non-transitory memory;
a speaker;
a display; and
one or more programs stored in the non-transitory memory, which, when executed by the one or more processors, cause the device to:
obtain a first audio file and a second audio file;
parse the first audio file into a plurality of first segments;
parse the second audio file into a plurality of second segments;
generate, for each of the plurality of first segments and each of the plurality of second segments, segment metadata;
determine a relationship between first segment metadata of one of the plurality of first segments and second segment metadata of one of the plurality of second segments;
generate computer-generated reality (CGR) content associated with the one of the plurality of first segments and the one of the plurality of second segments based on the relationship, the first segment metadata, and the second segment metadata; and
display the CGR content on the display by overlaying a virtual object onto a pass-through representation of a physical environment of the device when the device is concurrently playing the one of the plurality of first segments and the one of the plurality of second segments via the speaker.
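
The data flow recited in claim 14 (parse two files into segments, generate metadata for each segment, relate a first segment to a second segment, and generate content from that relationship) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the patent: text transcripts stand in for the audio files, a fixed word count defines a segment, keyword sets serve as the segment metadata, shared keywords serve as the relationship, and a plain dictionary stands in for the CGR content. All names (`Segment`, `parse_segments`, `segment_metadata`, `relationship`, `generate_cgr_content`, `related_content`) are hypothetical. The display and playback step is not modeled.

```python
from dataclasses import dataclass

# Hypothetical stopword list; a real system would use richer metadata.
STOPWORDS = frozenset({"a", "an", "and", "in", "of", "on", "the", "to"})


@dataclass(frozen=True)
class Segment:
    file_id: str   # which input file the segment came from
    index: int     # position of the segment within that file
    words: tuple   # transcript words, standing in for audio samples


def parse_segments(file_id, transcript, words_per_segment=4):
    """Split a transcript into fixed-size segments (assumed parsing rule)."""
    words = transcript.lower().split()
    return [
        Segment(file_id, i // words_per_segment, tuple(words[i:i + words_per_segment]))
        for i in range(0, len(words), words_per_segment)
    ]


def segment_metadata(segment):
    """Generate metadata for one segment: here, its non-stopword keywords."""
    return {"keywords": frozenset(segment.words) - STOPWORDS}


def relationship(first_meta, second_meta):
    """Relate two segments by the keywords their metadata share."""
    return first_meta["keywords"] & second_meta["keywords"]


def generate_cgr_content(first, second, shared):
    """Describe a virtual object derived from the relationship and both segments."""
    return {
        "virtual_object": "label",
        "text": ", ".join(sorted(shared)),
        "segments": ((first.file_id, first.index), (second.file_id, second.index)),
    }


def related_content(first_segments, second_segments):
    """Pair every first segment with every second segment and keep related pairs."""
    results = []
    for first in first_segments:
        for second in second_segments:
            shared = relationship(segment_metadata(first), segment_metadata(second))
            if shared:
                results.append(generate_cgr_content(first, second, shared))
    return results
```

In this sketch, a caller would pass the two segment lists returned by `parse_segments` to `related_content` and hand each resulting description to whatever renderer overlays the virtual object while the paired segments play.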