US 12,417,470 B1
Machine learning systems for optimizing audio advertisements
Daniel Neil MacTiernan, Ocean City, NJ (US); Rohit Bhatia, San Carlos, CA (US); and Laurence Benjamin Linietsky, Montclair, NJ (US)
Assigned to Amazon Technologies, Inc., Seattle, WA (US)
Filed by Amazon Technologies, Inc., Seattle, WA (US)
Filed on Dec. 9, 2022, as Appl. No. 18/064,197.
Int. Cl. G06Q 30/0241 (2023.01); G06Q 30/0251 (2023.01); H04N 21/233 (2011.01); H04N 21/439 (2011.01)
CPC G06Q 30/0251 (2013.01) [H04N 21/233 (2013.01); H04N 21/4394 (2013.01)]
20 Claims
 
1. A system, comprising:
one or more processors with associated memory that implement an audio ad delivery system, configured to:
receive, via a programmatic interface, a group of audio ad files from an advertiser for a particular product or service;
process the audio ad files using one or more trained audio processing models to extract ad features about individual ones of the audio ads, including respective types of calls-to-action (CTAs) used by the audio ads identified in audio content of the audio ad files using a speech recognition model;
send, via one or more audio ad servers, the audio ad files to different consumer engagement systems and under different listening contexts to play the audio ad files to create ad impressions;
receive user conversion results of the ad impressions from the consumer engagement systems;
train a machine learning model to learn conversion patterns for respective types of CTAs used by the audio ads in the different listening contexts;
update the machine learning model based on observation data associated with the ad impressions collected by one or more audio ad servers, the observation data including the ad features, user features of the users, context attributes of the listening contexts, and the user conversion results; and
automatically optimize delivery of subsequent audio ads, comprising deliver, via the one or more audio ad servers, additional plays of the audio ad files, wherein the one or more audio ad servers use the conversion patterns for respective types of CTAs in the different listening contexts to select particular versions of the audio ads for the additional plays in particular listening contexts of the listening contexts.
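
The learn-then-select loop recited in claim 1 can be illustrated with a minimal sketch. This is not the claimed implementation: the claim does not specify a model type, so a smoothed per-(CTA type, listening context) conversion rate is used here as a simple stand-in for the trained machine learning model. The class name CtaConversionModel, the function select_version, the CTA type labels, the context labels, and the smoothing priors are all hypothetical and chosen for illustration only.

```python
from collections import defaultdict


class CtaConversionModel:
    """Hypothetical stand-in for the trained model: tracks smoothed
    conversion rates per (CTA type, listening context) pair."""

    def __init__(self, prior_conversions=1.0, prior_impressions=20.0):
        # Assumed smoothing priors so sparse pairs are not over-trusted.
        self.prior_conversions = prior_conversions
        self.prior_impressions = prior_impressions
        self.impressions = defaultdict(int)
        self.conversions = defaultdict(int)

    def update(self, observations):
        """Fold in observation data: (cta_type, context, converted) tuples."""
        for cta_type, context, converted in observations:
            key = (cta_type, context)
            self.impressions[key] += 1
            self.conversions[key] += int(converted)

    def rate(self, cta_type, context):
        """Smoothed conversion rate estimate for one CTA type in one context."""
        key = (cta_type, context)
        conversions = self.conversions.get(key, 0) + self.prior_conversions
        impressions = self.impressions.get(key, 0) + self.prior_impressions
        return conversions / impressions


def select_version(model, versions, context):
    """Pick the ad version whose CTA type has the highest estimated
    conversion rate in the given listening context.

    versions maps an ad file identifier to its extracted CTA type."""
    return max(versions, key=lambda ad_id: model.rate(versions[ad_id], context))


# Illustrative data only: two versions of one ad, differing in CTA type.
versions = {"ad_voice.mp3": "voice_reply", "ad_web.mp3": "visit_website"}

observations = (
    [("voice_reply", "smart_speaker", True)] * 8
    + [("voice_reply", "smart_speaker", False)] * 2
    + [("visit_website", "smart_speaker", True)] * 1
    + [("visit_website", "smart_speaker", False)] * 9
    + [("voice_reply", "mobile_app", True)] * 1
    + [("voice_reply", "mobile_app", False)] * 9
    + [("visit_website", "mobile_app", True)] * 7
    + [("visit_website", "mobile_app", False)] * 3
)

model = CtaConversionModel()
model.update(observations)

best_for_speaker = select_version(model, versions, "smart_speaker")
best_for_mobile = select_version(model, versions, "mobile_app")
```

In this sketch, calling update again with newly collected observations corresponds to the claim's model update step, and select_version corresponds to an ad server choosing a version for an additional play in a particular listening context. A production system would presumably also condition on the user features and other ad features that the claim lists as observation data, which this count-based stand-in omits.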