US 12,205,601 B1
Content recognition using fingerprinting
David McGuire, Medford, MA (US); Ahmed Abdelal, North Andover, MA (US); Sai Kiran Venkata Subramanya Rupanagudi, Burien, WA (US); Sumit Garg, Acton, MA (US); Terrence Yu, Quincy, MA (US); Nathaniel White, Aldie, VA (US); Siddharth Agrawal, Hanover, NH (US); Pavas Kant, Winchester, MA (US); Yuxuan Hao, Natick, MA (US); Nagaraj Mahajan, Allston, MA (US); Ameya Agaskar, Bedford, MA (US); and Aaron Challenner, Melrose, MA (US)
Assigned to Amazon Technologies, Inc., Seattle, WA (US)
Filed by Amazon Technologies, Inc., Seattle, WA (US)
Filed on Jun. 29, 2022, as Appl. No. 17/853,183.
Int. Cl. G10L 19/018 (2013.01); G06F 21/62 (2013.01); G06V 20/40 (2022.01); G11B 27/34 (2006.01); H04R 3/00 (2006.01)
CPC G10L 19/018 (2013.01) [G06F 21/6218 (2013.01); G06V 20/46 (2022.01); G11B 27/34 (2013.01); H04R 3/00 (2013.01); G06V 2201/10 (2022.01); H04R 2420/09 (2013.01)]
20 Claims
1. A computer-implemented method comprising:
receiving first encoded input data corresponding to first content to be output by an output device, the first content corresponding to at least one of audio or video;
processing the first encoded input data to determine first decoded data configured to be sent by at least one media interface component to at least one output component for presentation to a user of the output device;
processing the first decoded data to determine first data corresponding to an extracted representation of a first portion of the first decoded data;
determining first metadata corresponding to the first portion;
sending, to the at least one output component of the output device via the at least one media interface component, the first portion of decoded data for playback; and
sending, to a second component of a system that is separate from the output device, the first data and the first metadata.
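
For illustration only, the data flow recited in claim 1 might be sketched as follows. This is a minimal sketch under stated assumptions, not the patented implementation: zlib stands in for an audio/video codec, a SHA-256 digest stands in for the "extracted representation" (a real fingerprint would need to be robust to perceptual changes, which a cryptographic hash is not), and every name here (OutputComponent, RemoteComponent, process_content, the metadata fields) is hypothetical.

```python
import hashlib
import time
import zlib
from dataclasses import dataclass, field


@dataclass
class OutputComponent:
    """Hypothetical stand-in for a speaker or display on the output device."""
    played: list = field(default_factory=list)

    def play(self, portion: bytes) -> None:
        # Record the portion instead of driving real hardware.
        self.played.append(portion)


@dataclass
class RemoteComponent:
    """Hypothetical stand-in for the second component, separate from the device."""
    received: list = field(default_factory=list)

    def send(self, fingerprint: str, metadata: dict) -> None:
        # Record the message instead of making a network call.
        self.received.append((fingerprint, metadata))


def decode(encoded: bytes) -> bytes:
    # Assumption: zlib decompression stands in for a real media decoder.
    return zlib.decompress(encoded)


def extract_representation(portion: bytes) -> str:
    # Assumption: a SHA-256 hex digest stands in for a content fingerprint.
    return hashlib.sha256(portion).hexdigest()


def process_content(encoded, output, remote, portion_size=4096, source="hdmi-1"):
    """Decode, fingerprint a first portion, play it, and report it remotely."""
    decoded = decode(encoded)                      # first decoded data
    portion = decoded[:portion_size]               # first portion
    fingerprint = extract_representation(portion)  # first data
    metadata = {                                   # first metadata (fields assumed)
        "offset": 0,
        "length": len(portion),
        "source": source,
        "timestamp": time.time(),
    }
    output.play(portion)                # decoded portion goes to playback
    remote.send(fingerprint, metadata)  # only fingerprint and metadata leave the device
    return fingerprint, metadata
```

In this sketch the decoded portion itself is passed only to the local output component, while the separate component receives the extracted representation and its metadata.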