US 12,445,663 B1
Apparatus and methods for a large language model with semantic audio for targeted advertising video stream
Yassine Maalej, Aurora, CO (US)
Assigned to Charter Communications Operating, LLC, St. Louis, MO (US)
Filed by Charter Communications Operating, LLC, St. Louis, MO (US)
Filed on Apr. 12, 2024, as Appl. No. 18/634,692.
Int. Cl. H04N 21/234 (2011.01); G06F 40/284 (2020.01); G06F 40/30 (2020.01); G06Q 30/0251 (2023.01)

CPC H04N 21/23424 (2013.01) [G06F 40/284 (2020.01); G06F 40/30 (2020.01); G06Q 30/0251 (2013.01)]

27 Claims

1. A method performed by a processing system in a network server computing device for inserting contextually relevant advertisements into a video stream, the method comprising:

receiving a primary video stream;

extracting an audio segment from the received primary video stream in response to detecting an SCTE35/SCTE104 marker in the received primary video stream;

extracting audio segments from a plurality of sources of secondary video content that include potential advertisements for insertion into the primary video stream;

using speech recognition technology to transcribe spoken words from the audio segment extracted from the received primary video stream and the audio segments extracted from the plurality of sources of the secondary video content into textual data;

using the textual data to query a large generative artificial intelligence model (LXM) to perform semantic analysis, generate semantic analysis results, and tokenize the generated semantic analysis results into tokenized data that includes sentences or semantically coherent units;

transforming the tokenized data into vector embeddings;

determining semantic similarity scores between the vector embeddings of the audio segment extracted from the received primary video stream and the audio segments extracted from the plurality of sources of the secondary video content;

using the determined semantic similarity scores to identify an advertisement in the secondary video content that has the highest semantic similarity score in relation to the audio segment extracted from the received primary video stream; and

inserting the identified advertisement into the primary video stream at an advertisement breakpoint indicated by the SCTE35/SCTE104 marker.
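
The embedding, scoring, and selection steps recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation. It assumes transcription has already produced text, and it substitutes a toy bag-of-words vector for the embeddings that the claimed LXM pipeline would produce. The function names (`split_into_units`, `embed`, `cosine_similarity`, `select_advertisement`), the choice of cosine similarity as the similarity score, and the sample transcripts are all hypothetical. SCTE marker detection, speech recognition, and the actual insertion into the stream are out of scope here.

```python
import math
import re
from collections import Counter

WORD_PATTERN = re.compile(r"[a-z']+")


def split_into_units(transcript: str) -> list[str]:
    """Split a transcript into sentences (a simple stand-in for
    'sentences or semantically coherent units')."""
    parts = re.split(r"(?<=[.!?])\s+", transcript)
    return [p.strip() for p in parts if p.strip()]


def embed(units: list[str], vocabulary: list[str]) -> list[float]:
    """ASSUMPTION: toy bag-of-words vector over a shared vocabulary.
    A real system would use model-generated embeddings instead."""
    counts = Counter(
        word for unit in units for word in WORD_PATTERN.findall(unit.lower())
    )
    return [float(counts[word]) for word in vocabulary]


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def select_advertisement(
    primary_transcript: str, ad_transcripts: dict[str, str]
) -> tuple[str, dict[str, float]]:
    """Score every candidate ad against the primary segment and
    return the id of the highest-scoring ad plus all scores."""
    all_texts = [primary_transcript, *ad_transcripts.values()]
    vocabulary = sorted(
        {w for text in all_texts for w in WORD_PATTERN.findall(text.lower())}
    )
    primary_vec = embed(split_into_units(primary_transcript), vocabulary)
    scores = {
        ad_id: cosine_similarity(
            primary_vec, embed(split_into_units(text), vocabulary)
        )
        for ad_id, text in ad_transcripts.items()
    }
    best_ad = max(scores, key=scores.get)
    return best_ad, scores


if __name__ == "__main__":
    # Hypothetical transcripts, used only to exercise the functions above.
    primary = "The chef sears the salmon. Then she plates the pasta with fresh basil."
    ads = {
        "ad_cookware": "Our nonstick pan sears salmon perfectly. Cook pasta like a chef.",
        "ad_insurance": "Switch your car insurance today. Save money on your policy.",
    }
    best, all_scores = select_advertisement(primary, ads)
    print(best, all_scores)
```

In this sketch the cooking segment shares words with the cookware transcript and none with the insurance transcript, so the cookware ad is selected. Learned embeddings would capture similarity beyond exact word overlap, which this placeholder cannot do.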