US 11,736,775 B1
Artificial intelligence audio descriptions for live events
Adam M. Balest, Lake Forest Park, WA (US)
Assigned to Amazon Technologies, Inc., Seattle, WA (US)
Filed by Amazon Technologies, Inc., Seattle, WA (US)
Filed on Dec. 9, 2021, as Appl. No. 17/643,569.
Int. Cl. H04N 21/81 (2011.01); H04N 21/2187 (2011.01); G06V 20/40 (2022.01); H04N 21/84 (2011.01)
CPC H04N 21/8106 (2013.01) [G06V 20/42 (2022.01); G06V 20/46 (2022.01); H04N 21/2187 (2013.01); H04N 21/84 (2013.01)]
20 Claims
 
1. A method, comprising:
obtaining, during a live sporting event, video content of the live sporting event;
identifying a plurality of visual elements within a sequence of video frames of the video stream using one or more machine learning models;
determining a set of semantic representations based on the plurality of visual elements, wherein the set of semantic representations describes a player throwing a ball;
generating an audio description based on the semantic representations;
encoding the video content in parallel with generating the audio description;
encoding audio content of the live sporting event;
packaging the encoded video content, encoded audio content, and audio descriptions;
providing a manifest having references to the encoded video content, encoded audio content, and audio descriptions to a client device;
receiving a request for the video content and the audio description from the client device; and
providing the video content and the audio description to the client device.
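
The steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation: every function name, the label vocabulary ("player", "ball", "throwing_pose"), the string stand-ins for encoded media, and the in-memory store are hypothetical, and the machine learning models are replaced by pre-labelled frames. The sketch only shows the ordering the claim describes: video encoding runs in parallel with audio description generation, the outputs are packaged, and a manifest carries references that a client then requests.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass
class Manifest:
    # References (not payloads) to the three packaged outputs.
    video_ref: str
    audio_ref: str
    description_ref: str


def detect_visual_elements(frames):
    # Stand-in for the machine learning models (assumption): each frame
    # already carries the labels a detector would have produced.
    return [frame["labels"] for frame in frames]


def to_semantic_representations(elements):
    # Hypothetical rule: a player, a ball, and a throwing pose seen in the
    # same frame yield one (subject, verb, object) representation.
    events = []
    for labels in elements:
        if {"player", "ball", "throwing_pose"} <= set(labels):
            events.append(("player", "throws", "ball"))
    return events


def generate_audio_description(frames):
    events = to_semantic_representations(detect_visual_elements(frames))
    # Text stands in for synthesized speech in this sketch.
    return ["The {} {} the {}.".format(*event) for event in events]


def encode_video(frames):
    # Placeholder for a real video encoder.
    return f"video-segment({len(frames)} frames)"


def encode_audio(samples):
    # Placeholder for a real audio encoder.
    return f"audio-segment({len(samples)} samples)"


def package_segment(frames, samples, segment_id):
    # Encode the video in parallel with generating the audio description.
    with ThreadPoolExecutor(max_workers=2) as pool:
        video_future = pool.submit(encode_video, frames)
        description_future = pool.submit(generate_audio_description, frames)
        encoded_video = video_future.result()
        descriptions = description_future.result()
    encoded_audio = encode_audio(samples)

    # Package the outputs and build a manifest that references them.
    store = {
        f"video/{segment_id}": encoded_video,
        f"audio/{segment_id}": encoded_audio,
        f"description/{segment_id}": descriptions,
    }
    manifest = Manifest(
        video_ref=f"video/{segment_id}",
        audio_ref=f"audio/{segment_id}",
        description_ref=f"description/{segment_id}",
    )
    return manifest, store


def handle_request(store, manifest):
    # The client requests the video and the audio description by reference.
    return store[manifest.video_ref], store[manifest.description_ref]
```

In this sketch a thread pool is used only to make the parallelism explicit; a production pipeline would more plausibly run the encoder and the description generator as separate services and publish the references in a streaming manifest format.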