US 12,444,191 B2
Background audio construction
Yi Zhang, Shanghai (CN)
Assigned to Shanghai Hode Information Technology Co., Ltd., Shanghai (CN)
Filed by Shanghai Hode Information Technology Co., Ltd., Shanghai (CN)
Filed on Apr. 12, 2023, as Appl. No. 18/133,641.
Application 18/133,641 is a continuation of application No. PCT/CN2021/120377, filed on Sep. 24, 2021.
Claims priority of application No. 202011437857.1 (CN), filed on Dec. 10, 2020.
Prior Publication US 2023/0245451 A1, Aug. 3, 2023
Int. Cl. G06V 20/40 (2022.01); G06F 16/783 (2019.01)
CPC G06V 20/41 (2022.01) [G06F 16/7834 (2019.01); G06V 20/46 (2022.01)]
18 Claims
 
1. A method, comprising:
performing semantic segmentation on to-be-processed video data to generate a corresponding semantic segmentation map, and extracting a semantic segmentation feature of the to-be-processed video data based on the semantic segmentation map;
extracting an audio feature of each audio file in a pre-established audio set; and
aligning the audio feature and the semantic segmentation feature, selecting a target audio file from the audio set based on an alignment result, and constructing background audio for the to-be-processed video data based on the target audio file,
wherein the aligning the audio feature and the semantic segmentation feature comprises:
performing dimension scaling processing on the audio feature and the semantic segmentation feature based on a preset feature dimension, to generate a target audio feature and a target semantic segmentation feature; and
aligning the target audio feature and the target semantic segmentation feature.
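
The dimension scaling and alignment steps recited above can be illustrated with a minimal sketch. The claim does not specify how features are scaled or how alignment is scored, so the choices here are assumptions made only for illustration: features are treated as 1-D vectors, scaling is done by linear interpolation to a preset dimension, and alignment is scored by cosine similarity. The function names (scale_to_dimension, cosine_similarity, select_target_audio) and the audio file names are hypothetical and do not come from the patent.

```python
import math
from typing import Dict, List, Sequence, Tuple


def scale_to_dimension(feature: Sequence[float], target_dim: int) -> List[float]:
    """Resample a 1-D feature vector to target_dim values.

    Assumption: linear interpolation stands in for the claimed
    "dimension scaling processing"; the claim names no method.
    """
    if target_dim <= 0:
        raise ValueError("target_dim must be positive")
    if not feature:
        raise ValueError("feature must be non-empty")
    # Degenerate cases: nothing to interpolate between.
    if len(feature) == 1 or target_dim == 1:
        return [float(feature[0])] * target_dim
    step = (len(feature) - 1) / (target_dim - 1)
    scaled = []
    for i in range(target_dim):
        pos = i * step
        lo = min(int(math.floor(pos)), len(feature) - 2)
        frac = pos - lo
        scaled.append(feature[lo] * (1.0 - frac) + feature[lo + 1] * frac)
    return scaled


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Assumed alignment score: cosine similarity of equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def select_target_audio(
    audio_features: Dict[str, Sequence[float]],
    segmentation_feature: Sequence[float],
    preset_dim: int,
) -> Tuple[str, float]:
    """Scale both feature types to preset_dim, score each audio file
    against the segmentation feature, and return the best (name, score)."""
    if not audio_features:
        raise ValueError("audio_features must be non-empty")
    target_seg = scale_to_dimension(segmentation_feature, preset_dim)
    best_name, best_score = "", -math.inf
    for name, feature in audio_features.items():
        target_audio = scale_to_dimension(feature, preset_dim)
        score = cosine_similarity(target_audio, target_seg)
        if score > best_score:
            best_name, best_score = name, score
    return best_name, best_score


if __name__ == "__main__":
    # Hypothetical audio set with toy 2-D features.
    audio_set = {"calm.mp3": [1.0, 0.0], "upbeat.mp3": [0.0, 1.0]}
    video_feature = [0.0, 0.5, 1.0, 1.0]  # toy segmentation feature
    print(select_target_audio(audio_set, video_feature, preset_dim=3))
```

In this sketch the audio file with the highest score plays the role of the target audio file. A practical system would more likely use learned projections and a trained matching model, which the claim language also accommodates.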