US 11,934,406 B2
Digital content data generation systems and methods
Zeidee Pineda, West New York, NJ (US); Nicole Torres, Freehold, NJ (US); Bing Xu, Fort Lee, NJ (US); Nana Yaw Essuman, Dumfries, VA (US); and Hengyu Tang, New York, NY (US)
Assigned to NBCUniversal Media, LLC, New York, NY (US)
Filed by NBCUniversal Media, LLC, New York, NY (US)
Filed on Nov. 19, 2020, as Appl. No. 16/952,908.
Prior Publication US 2022/0156265 A1, May 19, 2022
Int. Cl. G06F 7/00 (2006.01); G06F 7/14 (2006.01); G06F 16/2455 (2019.01); G06N 20/00 (2019.01)
CPC G06F 16/24558 (2019.01) [G06F 7/14 (2013.01); G06N 20/00 (2019.01)]
18 Claims
 
1. A non-transitory computer-readable medium comprising computer readable instructions, that when executed by one or more processors, causes the one or more processors to perform operations comprising:
receiving an input data set related to digital content, wherein the input data set comprises a plurality of input entries corresponding to digital content of a digital content service provider, wherein the digital content comprises a provider-specified digital content identifier;
in response to receiving the input data, match the input data set to a baseline data source having different content identifiers than the provider-specified digital content identifier, by:
matching each input entry of the plurality of input entries to one or more baseline entries of a baseline data set provided by a baseline data source different than the digital content service provider;
assigning a probability score to each respective baseline entry of the one or more baseline entries for each respective input entry based on metadata associated with the input data set, wherein the probability score for each respective baseline entry indicates a probability that the respective baseline entry is an accurate match to the input entry;
filtering out respective baseline entries that do not meet a threshold probability score;
after the filtering:
when one respective baseline entry for the respective input entry remains, automatically matching the respective baseline entry to the respective input;
when more than one respective baseline entry for the respective input entry remains, suggesting a match of a highest probability scored one of the more than one respective baseline entries to the respective input entry; and
when no respective baseline entry for the respective input entry remains, identifying no match to the respective input entry;
causing presentation of a graphical user interface that presents results of the matching of the input data set to the baseline data source, the results comprising the automatic matchings, the suggested matchings, and the no matchings; and
generating an output data set comprising a plurality of output entries based upon the matching of the input data set to the baseline data source, wherein each respective input entry corresponds to a respective output entry of the plurality of output entries, and wherein each respective output entry with a matching baseline entry is configured to provide enhanced querying associated with the digital content, by providing, to a search engine for searching:
the matching baseline entry of the one or more baseline entries that is matched to the respective input entry; and
additional data associated with the respective input entry from the input data set.
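
For illustration only, the score, filter, and three-way decision recited in claim 1 can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the patented implementation. The claim does not specify which metadata is used or how the probability score is computed, so the `Entry` fields, the title-similarity-plus-year scoring formula, the 0.75 threshold, and the status labels are all hypothetical.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher
from typing import Optional


@dataclass
class Entry:
    # Hypothetical metadata fields; the claim only says "metadata".
    content_id: str
    title: str
    year: int


def score(input_entry: Entry, baseline_entry: Entry) -> float:
    """Assumed scoring rule: weighted title similarity plus release-year agreement."""
    title_sim = SequenceMatcher(
        None, input_entry.title.lower(), baseline_entry.title.lower()
    ).ratio()
    year_sim = 1.0 if input_entry.year == baseline_entry.year else 0.0
    return 0.8 * title_sim + 0.2 * year_sim


def match_entry(input_entry: Entry, baseline: list[Entry], threshold: float = 0.75):
    """Score every baseline entry, drop those below the threshold, then decide."""
    scored = [(score(input_entry, b), b) for b in baseline]
    kept = [(s, b) for s, b in scored if s >= threshold]
    if not kept:
        return "no_match", None, None
    best_score, best = max(kept, key=lambda pair: pair[0])
    # One survivor: automatic match. Several survivors: suggest the top-scored one.
    status = "auto_match" if len(kept) == 1 else "suggested_match"
    return status, best, best_score


def build_output(inputs: list[Entry], baseline: list[Entry], threshold: float = 0.75):
    """One output entry per input entry: matched baseline ID plus the input's own data."""
    output = []
    for entry in inputs:
        status, best, best_score = match_entry(entry, baseline, threshold)
        matched_id: Optional[str] = best.content_id if best else None
        output.append({
            "input_id": entry.content_id,   # provider-specified identifier
            "baseline_id": matched_id,      # baseline source's identifier
            "status": status,
            "score": best_score,
            "title": entry.title,           # additional data from the input set
            "year": entry.year,
        })
    return output
```

In this sketch, each output record carries both the baseline identifier and the input's own fields, mirroring the claim's requirement that the matched baseline entry and additional input data be made available for searching. The status field is what a results interface would use to separate automatic matches, suggested matches, and entries with no match.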