US 11,758,206 B1
Encoding media content for playback compatibility
Yongjun Wu, Bellevue, WA (US); Alex Zhang, Seattle, WA (US); Kyle Bradley Koceski, Seattle, WA (US); Matthew Scharr, Portland, OR (US); Viriya Ratanasangpunth, Portland, OR (US); and Michael Kale, Portland, OR (US)
Assigned to Amazon Technologies, Inc., Seattle, WA (US)
Filed by Amazon Technologies, Inc., Seattle, WA (US)
Filed on Mar. 12, 2021, as Appl. No. 17/249,781.
Int. Cl. H04N 21/2368 (2011.01); H04N 21/2365 (2011.01); G10L 19/16 (2013.01); H04N 21/234 (2011.01); H04N 21/2387 (2011.01); H04N 21/643 (2011.01); H04N 21/236 (2011.01)
CPC H04N 21/2365 (2013.01) [G10L 19/167 (2013.01); H04N 21/2368 (2013.01); H04N 21/2387 (2013.01); H04N 21/23424 (2013.01); H04N 21/23611 (2013.01); H04N 21/643 (2013.01)]
18 Claims
 
1. A computer program product, comprising one or more non-transitory computer-readable media having computer program instructions stored therein, the computer program instructions being configured such that, when executed by one or more computing devices, the computer program instructions cause the one or more computing devices to:
encode video content of a media presentation using the High Efficiency Video Coding codec to generate a video component of the media presentation, the video component having a video duration associated therewith;
encode audio content of the media presentation using the Advanced Audio Coding codec to generate an audio component of the media presentation, the audio component having an audio duration associated therewith;
if it is determined that the audio duration is shorter than the video duration, add one or more audio frames to the audio component such that the audio duration is greater than or equal to the video duration, and the audio duration comprises an upper limit that is dynamically determined based at least on the video duration and a respective duration of each of the one or more audio frames, the respective duration being based on a quantity of audio samples divided by a sampling frequency, wherein the upper limit is enforced responsive to a cumulative audio delay being at or exceeding a prescribed threshold;
if it is determined that the audio duration is longer than the video duration, determine whether the audio duration is longer than the upper limit;
if it is determined that the audio duration is longer than the upper limit, remove one or more audio frames from the audio component such that the audio duration is greater than or equal to the video duration, and the audio duration comprises the upper limit that is dynamically determined based at least on the video duration and the respective duration of each of the one or more audio frames; and
package the video component and the audio component as parts of the media presentation using a fragmented MP4 container.
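The duration-alignment steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation. The claim states only that the upper limit is determined from the video duration and the per-frame audio duration; the specific formula used here (video duration plus one frame duration) is an assumption, as are the function names, the default of 1024 samples per frame at 48 kHz, and the reading that trimming is gated on the cumulative-delay threshold. The sketch returns a target frame count rather than manipulating encoded frames.

```python
import math
from fractions import Fraction


def frame_duration(samples_per_frame: int, sampling_hz: int) -> Fraction:
    """Per-frame duration in seconds: audio samples divided by sampling frequency."""
    return Fraction(samples_per_frame, sampling_hz)


def align_audio_frame_count(
    num_audio_frames: int,
    video_duration: Fraction,
    samples_per_frame: int = 1024,   # assumed typical AAC frame size
    sampling_hz: int = 48000,        # assumed sampling frequency
    cumulative_delay: Fraction = Fraction(0),   # hypothetical parameter
    delay_threshold: Fraction = Fraction(0),    # hypothetical parameter
) -> int:
    """Return the audio frame count after padding or trimming (hypothetical helper)."""
    d = frame_duration(samples_per_frame, sampling_hz)
    audio_duration = num_audio_frames * d

    # Assumed formula: the limit depends on the video duration and frame duration.
    upper_limit = video_duration + d

    # One reading of the claim: enforce the limit once the delay reaches the threshold.
    enforce_limit = cumulative_delay >= delay_threshold

    if audio_duration < video_duration:
        # Pad with the fewest frames that make audio >= video.
        # The result is always below video_duration + d, so it respects the limit.
        return math.ceil(video_duration / d)

    if audio_duration > upper_limit and enforce_limit:
        # Trim to the largest count within the limit; audio stays >= video.
        return math.floor(upper_limit / d)

    return num_audio_frames


if __name__ == "__main__":
    video = Fraction(10)  # a 10-second video component
    for frames in (460, 469, 480):
        print(frames, "->", align_audio_frame_count(frames, video))
```

Exact rational arithmetic (`fractions.Fraction`) is used so that comparisons between durations are not affected by floating-point rounding; a production encoder would more likely compare durations in integer timescale units.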