US 12,119,115 B2
Systems and methods for self-supervised learning based on naturally-occurring patterns of missing data
Luca Foschini, Santa Barbara, CA (US); Filip Jankovic, Santa Barbara, CA (US); Raghunandan Melkote Kainkaryam, Cincinnati, OH (US); Juan Ignacio Oguiza Mendez, Valencia (ES); and Arinbjörn Kolbeinsson, Berlin (DE)
Assigned to Evidation Health, Inc., San Mateo, CA (US)
Filed by Evidation Health, Inc., San Mateo, CA (US)
Filed on Jan. 18, 2023, as Appl. No. 18/156,010.
Claims priority of provisional application 63/412,054, filed on Sep. 30, 2022.
Claims priority of provisional application 63/306,447, filed on Feb. 3, 2022.
Prior Publication US 2023/0245777 A1, Aug. 3, 2023
Int. Cl. G16H 50/20 (2018.01); G16H 50/70 (2018.01)
CPC G16H 50/20 (2018.01) [G16H 50/70 (2018.01)]
30 Claims
 
1. A method comprising:
(a) identifying, based at least in part on one or more demographics of a target user, a population from a larger group of users, wherein the population comprises one or more digital twins of the target user, and wherein the population is characterized by having a common demographic;
(b) accessing, by a machine learning system, a set of data records for a plurality of users of the population, the set of data records representative of physical statistics measured for each user of the plurality of users of the population over a time period, wherein the physical statistics for each user of the plurality of users of the population are measured using a wearable device associated with each user of the plurality of users of the population;
(c) generating a set of masked data records by masking at least a subset of the set of data records in accordance with a pattern of missingness from the set of data records, wherein the pattern of missingness from the set of data records corresponds to periods of disuse or deactivation of the wearable device associated with each user of the plurality of users of the population;
(d) generating, by the machine learning system, a plurality of learned representations from at least the set of masked data records; and
(e) fine tuning, by the machine learning system, a machine learning model using the plurality of learned representations, the machine learning model configured to perform a downstream machine learning task that comprises imputing missing data from a wearable device associated with the target user, thereby generating complete data for the target user.
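
For illustration only, step (c) can be sketched as follows. This is a minimal sketch under stated assumptions, not the claimed implementation: it assumes each data record is a fixed-length list of readings in which `None` marks a sample lost while the wearable was off or not worn, and it borrows the missingness pattern of a randomly chosen donor record to hide values that were actually observed. The function name `apply_missingness_pattern` and the donor-sampling scheme are hypothetical and do not come from the patent.

```python
import random


def apply_missingness_pattern(records, rng):
    """Mask observed values using missingness patterns found in the data.

    records: list of equal-length lists; None marks a naturally missing
             sample (assumed to reflect device disuse or deactivation).
    rng:     a random.Random instance used to pick donor records.

    Returns (masked, targets). masked has the same shape as records, with
    extra None entries; targets[i][t] is True where a value that was
    observed in records has been hidden in masked.
    """
    n_users = len(records)
    masked, targets = [], []
    for row in records:
        # Borrow the real missingness pattern of a randomly chosen record.
        donor = records[rng.randrange(n_users)]
        # Hide a value only where the donor is missing and this row is
        # observed, so the hidden ground truth is always known.
        hide = [d is None and v is not None for d, v in zip(donor, row)]
        masked.append([None if h else v for h, v in zip(hide, row)])
        targets.append(hide)
    return masked, targets


if __name__ == "__main__":
    # Toy step-count records for three users over four time steps.
    toy = [
        [1.0, 2.0, None, 4.0],
        [None, 6.0, 7.0, 8.0],
        [9.0, 10.0, 11.0, None],
    ]
    masked, targets = apply_missingness_pattern(toy, random.Random(0))
    for m, t in zip(masked, targets):
        print(m, t)
```

Because the hidden positions follow gaps that really occur in the population's records rather than uniformly random positions, the masked records resemble the incomplete data that the imputation task in step (e) would face. How the learned representations of step (d) are produced from these masked records is not specified by this sketch.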