US 12,105,754 B2
Audio identification based on data structure
Zafar Rafii, Berkeley, CA (US); and Prem Seetharaman, Chicago, IL (US)
Assigned to Gracenote, Inc., New York, NY (US)
Filed by Gracenote, Inc., Emeryville, CA (US)
Filed on Jan. 8, 2024, as Appl. No. 18/406,840.
Application 18/406,840 is a continuation of application No. 16/927,577, filed on Jul. 13, 2020, granted, now 11,907,288.
Application 16/927,577 is a continuation of application No. 15/698,532, filed on Sep. 7, 2017, granted, now 10,713,296, issued on Jul. 14, 2020.
Claims priority of provisional application 62/385,574, filed on Sep. 9, 2016.
Prior Publication US 2024/0160665 A1, May 16, 2024
This patent is subject to a terminal disclaimer.
Int. Cl. G06F 16/68 (2019.01); G06F 16/61 (2019.01); G06F 17/14 (2006.01); G10L 25/27 (2013.01); G10L 25/51 (2013.01)
CPC G06F 16/686 (2019.01) [G06F 16/61 (2019.01); G06F 17/14 (2013.01); G10L 25/27 (2013.01); G10L 25/51 (2013.01)]
20 Claims
 
1. A non-transitory computer-readable medium comprising instructions that, when executed, cause one or more processors to perform a set of operations comprising:
binarizing one or more constant Q transformed time slices of query audio;
generating two-dimensional Fourier transforms of time windows within the binarized one or more constant Q transformed time slices;
ordering the two-dimensional Fourier transforms in a query data structure; and
identifying the query audio as a cover rendition of reference audio based on a comparison between the query data structure and a reference data structure associated with the reference audio.
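
The four operations recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation, and several details are assumptions the claim does not specify: the constant Q magnitudes are taken as already computed and passed in as a list of time slices; each slice is binarized against its own median; only the magnitude of each two-dimensional Fourier transform is kept; and the comparison is a mean Euclidean distance between transforms at the same position. The names `binarize`, `dft2_magnitude`, `build_structure`, `structure_distance`, `win`, and `hop` are hypothetical. A naive pure-Python DFT is used only to keep the sketch dependency-free.

```python
import cmath
from statistics import median
from typing import List

Slices = List[List[float]]  # slices[t][f]: constant Q magnitude at time t, bin f


def binarize(slices: Slices) -> Slices:
    """Set each bin to 1.0 if it exceeds its slice's median, else 0.0.

    The median rule is an assumption; the claim only says "binarizing".
    """
    out = []
    for s in slices:
        thr = median(s)
        out.append([1.0 if v > thr else 0.0 for v in s])
    return out


def dft2_magnitude(window: Slices) -> Slices:
    """Magnitude of the 2D discrete Fourier transform of one time window.

    Keeping only the magnitude (an assumption) makes the result unchanged
    under circular shifts of the window along either axis.
    """
    rows, cols = len(window), len(window[0])
    out = [[0.0] * cols for _ in range(rows)]
    for u in range(rows):
        for v in range(cols):
            acc = 0j
            for r in range(rows):
                for c in range(cols):
                    phase = -2j * cmath.pi * (u * r / rows + v * c / cols)
                    acc += window[r][c] * cmath.exp(phase)
            out[u][v] = abs(acc)
    return out


def build_structure(slices: Slices, win: int, hop: int) -> List[Slices]:
    """Binarize, transform each time window, and keep the results in time order."""
    binary = binarize(slices)
    starts = range(0, len(binary) - win + 1, hop)
    return [dft2_magnitude(binary[s:s + win]) for s in starts]


def structure_distance(a: List[Slices], b: List[Slices]) -> float:
    """Mean Euclidean distance between transforms at the same position.

    A hypothetical comparison; only the overlapping prefix is compared.
    """
    n = min(len(a), len(b))
    if n == 0:
        raise ValueError("both structures need at least one window")
    total = 0.0
    for k in range(n):
        sq = sum((x - y) ** 2
                 for ra, rb in zip(a[k], b[k])
                 for x, y in zip(ra, rb))
        total += sq ** 0.5
    return total / n
```

Under these assumptions, a query whose frequency bins are circularly shifted relative to the reference yields the same structure as the reference, so `structure_distance` returns a value near zero. A decision threshold on that distance, which would have to be chosen empirically, could then flag the query as a candidate cover rendition.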