CPC H04N 21/251 (2013.01) [H04N 21/44204 (2013.01)] | 20 Claims |
1. A system comprising:
a database of television (TV) viewing data comprising program records for a multiplicity of existing TV programs, each program record identifying a respective TV program and including, for the respective TV program, a first set of historical presentation-logistics (PL) features, a second set of content-descriptor (CD) features, and a third set of historical viewer-rating (VR) metrics, wherein the historical PL features comprise information identifying a content-delivery platform that previously sourced the respective TV program for end-user viewing, and specifying a delivery mode used to deliver the respective TV program and a release-schedule drop pattern (RSDP) that was used by the content-delivery platform for viewing availability and/or program delivery, wherein the CD features comprise information characterizing media content of the respective TV program, and wherein the historical VR metrics comprise, for the historical PL features, statistical quantification of viewing performance of the respective TV program among one or more audience categories;
one or more processors; and
memory storing instructions that, when executed by the one or more processors, cause the system to carry out operations including:
receiving a training plurality of program records from the TV viewing data;
for each given program record of at least a subset of the program records of the training plurality, identifying from among the training plurality a most similar TV program based on a quantitative comparison of CD features of the given program record with those of the other program records of the training plurality, wherein the most similar TV program is different from the respective program of the given program record;
based on each given program record and its identified most similar TV program, creating a synthetic program record comprising historical PL features from the given program record, CD features of the most similar TV program, and with historical VR metrics omitted and/or replaced with null values;
by applying an aggregate of the training plurality of program records and the synthetic program records as input and historical VR features of the training plurality of program records as ground-truths, training a machine-learning (ML) model to predict audience performance metrics of the respective TV programs of the training plurality of program records; and
configuring the trained ML model for predicting audience performance metrics of one or more runtime program records respectively associated with hypothetical TV programs not yet available for viewing and/or not yet transmitted.
|
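The similarity-matching and synthetic-record steps recited in claim 1 can be illustrated with a minimal sketch. The claim does not specify the quantitative comparison, the feature encoding, or any field names, so everything below is an assumption made for illustration: CD features are taken to be numeric vectors, cosine similarity stands in for the "quantitative comparison", and `ProgramRecord`, `most_similar`, `make_synthetic`, and the sample values are hypothetical.

```python
import math
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class ProgramRecord:
    # Hypothetical record layout; the claim names the three feature groups
    # but does not define their encoding.
    program_id: str
    pl: Dict[str, str]               # historical PL features (platform, delivery mode, RSDP)
    cd: List[float]                  # CD features, assumed to be a numeric vector
    vr: Optional[Dict[str, float]]   # historical VR metrics; None when omitted


def cosine(a: List[float], b: List[float]) -> float:
    # Cosine similarity is an assumed stand-in for the claimed comparison.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def most_similar(given: ProgramRecord, records: List[ProgramRecord]) -> ProgramRecord:
    # Exclude the given program itself, as the claim requires a different program.
    candidates = [r for r in records if r.program_id != given.program_id]
    return max(candidates, key=lambda r: cosine(given.cd, r.cd))


def make_synthetic(given: ProgramRecord, records: List[ProgramRecord]) -> ProgramRecord:
    # PL features come from the given record, CD features from its nearest
    # neighbor, and VR metrics are left as None (the "null values" option).
    neighbor = most_similar(given, records)
    return ProgramRecord(
        program_id=f"{given.program_id}+{neighbor.program_id}",
        pl=dict(given.pl),
        cd=list(neighbor.cd),
        vr=None,
    )


# Made-up sample data, for illustration only.
training = [
    ProgramRecord("A", {"platform": "P1", "mode": "streaming", "rsdp": "all-at-once"},
                  [1.0, 0.0, 0.2], {"avg_rating": 2.1}),
    ProgramRecord("B", {"platform": "P2", "mode": "broadcast", "rsdp": "weekly"},
                  [0.9, 0.1, 0.3], {"avg_rating": 1.4}),
    ProgramRecord("C", {"platform": "P1", "mode": "streaming", "rsdp": "weekly"},
                  [0.0, 1.0, 0.0], {"avg_rating": 0.7}),
]

synthetic = [make_synthetic(r, training) for r in training]
aggregate = training + synthetic  # combined input set for model training
```

In this sketch, `aggregate` corresponds to the combined input of original and synthetic records recited in the training step. How the model treats synthetic records that carry no VR metrics during training is not stated in the claim and is left out here.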