US 12,444,195 B1
Audio data selection for video matching using generative artificial intelligence model
Daniel David Dunlap, Jr., Pittsburgh, PA (US); Andrew Jerome Feltenstein, Bozeman, MT (US); and John Amor Nau, Los Angeles, CA (US)
Assigned to Beacon Street Technologies, LLC, Bozeman, MT (US)
Filed by Beacon Street Technologies, LLC, Bozeman, MT (US)
Filed on Mar. 4, 2025, as Appl. No. 19/070,361.
Claims priority of provisional application 63/659,104, filed on Jun. 12, 2024.
Int. Cl. G06V 20/40 (2022.01); G06F 3/04847 (2022.01); G06F 16/438 (2019.01); G06V 10/70 (2022.01); G11B 27/036 (2006.01)
CPC G06V 20/47 (2022.01) [G06F 3/04847 (2013.01); G06F 16/438 (2019.01); G06V 10/70 (2022.01); G06V 20/41 (2022.01); G11B 27/036 (2013.01)] 20 Claims
 
1. A method comprising:
receiving a video at a video editing system from a client device;
extracting a plurality of key frames from the video using a key frame extraction algorithm;
transmitting a narrative generation prompt to a generative AI model, wherein the narrative generation prompt comprises the extracted key frames and text instructions to identify a video narrative for the video;
receiving the video narrative from the generative AI model in a response to the narrative generation prompt;
transmitting a tag generation prompt to the generative AI model with the video narrative and text instructions to generate a plurality of descriptor tags;
receiving the plurality of descriptor tags from the generative AI model in a response to the tag generation prompt;
retrieving a set of songs and a set of scores from an audio tagging system using a query comprising the plurality of descriptor tags;
ranking the retrieved set of songs based on the retrieved set of scores; and
transmitting the ranked set of songs for display to the client device through a video editing interface.
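
The steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation. Every name in it is hypothetical: `Frame`, `Model`, `Song`, `extract_key_frames`, `query_songs`, `rank_songs`, and `suggest_songs` do not appear in the claim. The claim does not specify the key frame extraction algorithm, the prompt wording, or how the audio tagging system computes its scores, so the sketch assumes simple frame differencing over grayscale pixel lists, placeholder instruction strings, and a score equal to the fraction of query tags a song matches. The generative AI model is represented by a caller-supplied function, and transport to and from the client device is omitted.

```python
from dataclasses import dataclass
from typing import Callable, Sequence

Frame = Sequence[int]  # hypothetical: one frame as flat grayscale pixel values
Model = Callable[[str, list], str]  # hypothetical: (text instructions, attachments) -> reply


@dataclass(frozen=True)
class Song:
    title: str
    tags: frozenset  # lowercase descriptor tags stored by the audio tagging system


def extract_key_frames(frames: Sequence[Frame], threshold: float = 30.0) -> list:
    """Assumed algorithm: keep a frame when it differs enough from the last kept frame."""
    key_frames: list = []
    for frame in frames:
        if key_frames:
            last = key_frames[-1]
            # Mean absolute pixel difference against the most recent key frame.
            diff = sum(abs(a - b) for a, b in zip(frame, last)) / len(frame)
            if diff <= threshold:
                continue
        key_frames.append(frame)
    return key_frames


def query_songs(catalog: Sequence[Song], tags: Sequence[str]) -> tuple:
    """Stand-in for the audio tagging system: returns matching songs and their scores."""
    query = {t.lower() for t in tags}
    songs, scores = [], []
    for song in catalog:
        # Assumed scoring rule: fraction of query tags present on the song.
        score = len(query & song.tags) / len(query) if query else 0.0
        if score > 0:
            songs.append(song)
            scores.append(score)
    return songs, scores


def rank_songs(songs: Sequence[Song], scores: Sequence[float]) -> list:
    """Order the retrieved songs by their retrieved scores, highest first."""
    order = sorted(range(len(songs)), key=lambda i: scores[i], reverse=True)
    return [songs[i] for i in order]


def suggest_songs(frames: Sequence[Frame], model: Model, catalog: Sequence[Song]) -> list:
    key_frames = extract_key_frames(frames)
    # Narrative generation prompt: key frames plus text instructions.
    narrative = model(
        "Identify the narrative of the video shown in these key frames.", key_frames
    )
    # Tag generation prompt: the narrative plus text instructions.
    reply = model(
        "Generate comma-separated descriptor tags for this video narrative.", [narrative]
    )
    tags = [t.strip() for t in reply.split(",") if t.strip()]
    songs, scores = query_songs(catalog, tags)
    return rank_songs(songs, scores)
```

In this sketch the two prompts are issued sequentially, so the tag generation prompt depends only on the narrative text and not on the key frames themselves, mirroring the order of the claimed steps.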