US 12,456,492 B2
Multimedia data recording method and device
Gang Ma, Beijing (CN); and Bo Liu, Beijing (CN)
Assigned to LENOVO (BEIJING) LIMITED, Beijing (CN)
Filed by Lenovo (Beijing) Limited, Beijing (CN)
Filed on Feb. 5, 2024, as Appl. No. 18/432,914.
Claims priority of application No. 202310115797.9 (CN), filed on Feb. 8, 2023.
Prior Publication US 2024/0312487 A1, Sep. 19, 2024
Int. Cl. G11B 27/036 (2006.01); G06V 20/40 (2022.01); G10L 17/02 (2013.01); G10L 25/57 (2013.01); G10L 25/63 (2013.01)
CPC G11B 27/036 (2013.01) [G06V 20/41 (2022.01); G10L 17/02 (2013.01); G10L 25/57 (2013.01); G10L 25/63 (2013.01)] 18 Claims
 
1. A multimedia data recording method comprising:
performing real-time analysis on multimedia data to obtain voice content and a demonstration action of a target object, the multimedia data including first audio data and image frame data that are simultaneously collected;
determining whether the demonstration action is semantically consistent with the voice content, wherein the demonstration action is semantically consistent with the voice content when a first similarity between text content corresponding to the demonstration action and text content corresponding to the voice content is greater than a threshold;
in response to the demonstration action being semantically inconsistent with the voice content, performing video understanding on an image frame corresponding to the demonstration action, to convert the demonstration action to second audio data; and
dynamically inserting the second audio data into the first audio data to update the multimedia data.
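
For illustration only, the decision logic recited in claim 1 can be sketched as follows. This is a minimal sketch under stated assumptions, not the patented implementation. The claim does not specify the similarity measure, the threshold value, the audio representation, or the insertion point, so the sketch assumes a token-set Jaccard similarity, a threshold of 0.5, audio held as a list of integer samples, and a caller-supplied sample index. The names jaccard_similarity, is_semantically_consistent, insert_audio, update_recording, and the synthesize callback are hypothetical. The callback stands in for the video-understanding and speech-generation step, which is outside the scope of the sketch.

```python
from typing import Callable, List


def jaccard_similarity(text_a: str, text_b: str) -> float:
    """Token-set Jaccard similarity (assumed stand-in for the 'first similarity')."""
    tokens_a = set(text_a.lower().split())
    tokens_b = set(text_b.lower().split())
    if not tokens_a and not tokens_b:
        return 1.0  # two empty texts are treated as identical
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


def is_semantically_consistent(action_text: str, voice_text: str,
                               threshold: float = 0.5) -> bool:
    """Consistent only when the similarity is strictly greater than the threshold."""
    return jaccard_similarity(action_text, voice_text) > threshold


def insert_audio(first_audio: List[int], second_audio: List[int],
                 at_sample: int) -> List[int]:
    """Return a new sample list with second_audio spliced in at at_sample."""
    at_sample = max(0, min(at_sample, len(first_audio)))  # clamp to valid range
    return first_audio[:at_sample] + second_audio + first_audio[at_sample:]


def update_recording(first_audio: List[int], action_text: str, voice_text: str,
                     synthesize: Callable[[str], List[int]], at_sample: int,
                     threshold: float = 0.5) -> List[int]:
    """Insert synthesized narration only when action and voice are inconsistent."""
    if is_semantically_consistent(action_text, voice_text, threshold):
        return first_audio  # nothing to add
    second_audio = synthesize(action_text)  # hypothetical description-to-audio step
    return insert_audio(first_audio, second_audio, at_sample)


if __name__ == "__main__":
    def fake_tts(text: str) -> List[int]:
        # Placeholder synthesizer: returns two dummy samples.
        return [9, 9]

    updated = update_recording([1, 2, 3, 4], "press the red button",
                               "now click here", fake_tts, at_sample=2)
    print(updated)
```

In this sketch the original audio is returned unchanged when the two texts are judged consistent, which mirrors the conditional "in response to" language of the claim. A real system would work on timestamped audio buffers and a semantic rather than lexical similarity measure.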