US 12,013,863 B2
Multi-resolution modeling of discrete stochastic processes for computationally-efficient information search and retrieval
Bonnie Berger Leighton, Newtonville, MA (US); Maxwell Aaron Sherman, Marblehead, MA (US); and Adam Uri Yaari, Cambridge, MA (US)
Filed by Bonnie Berger Leighton, Newtonville, MA (US); Maxwell Aaron Sherman, Marblehead, MA (US); and Adam Uri Yaari, Cambridge, MA (US)
Filed on Apr. 18, 2022, as Appl. No. 17/722,793.
Application 17/722,793 is a continuation of application No. 17/459,346, filed on Aug. 27, 2021, granted, now 11,308,101, issued on Apr. 19, 2022.
Claims priority of provisional application 63/080,694, filed on Sep. 19, 2020.
Prior Publication US 2022/0374425 A1, Nov. 24, 2022
This patent is subject to a terminal disclaimer.
Int. Cl. G06F 16/2458 (2019.01); G06N 3/08 (2023.01); G16B 5/20 (2019.01); G16B 20/20 (2019.01); G16B 40/00 (2019.01)
CPC G06F 16/2462 (2019.01) [G06N 3/08 (2013.01); G16B 5/20 (2019.02); G16B 20/20 (2019.02); G16B 40/00 (2019.02)]
3 Claims
 
1. An information search and retrieval apparatus for use with an input dataset that is a non-stationary discrete stochastic process, wherein events of interest occur within the process approximately independently at units i within a region R, and with an unknown rate λR that is approximately constant across R, wherein λR has an associated estimation uncertainty defined by a set of parameters that include expectation μR and variance σR², and wherein the non-stationary discrete stochastic process comprising a plurality of regions, comprising:
a hardware processor; and
computer memory holding computer program code executed by the hardware processor, the computer program code comprising:
a neural network and a Gaussian process that uses the input dataset to one-time train a model to predict rate parameters and their associated estimation uncertainty for each region R of the plurality of regions; and
a query function that receives a query associated with any arbitrary set of indexed positions within the plurality of regions and, in response, scales existing predictions of the rate parameters and their associated estimation uncertainties from the trained model for any region of the plurality of regions to obtain a response to the query, the response being a distribution of the events of interest and their associated estimation uncertainties for the set of indexed positions; and
an output function that applies data derived from the response to a target application;
wherein the target application identifies anomalies in one of: a genome for a cancer of interest, a computer network, and a population.
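
The query function recited above can be illustrated with a minimal sketch. It is not the patented implementation. It assumes that each region R already carries a predicted expected event count μR and variance σR² from the trained model, that the rate is constant within a region (as the claim states), and that regions are independent, so that per-region variances may be summed. The names RegionPrediction, scale_to_positions, and query are hypothetical and chosen only for illustration. The count variance uses the law of total variance for a Poisson count whose rate is itself uncertain.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass(frozen=True)
class RegionPrediction:
    """Hypothetical container for one region's pre-computed prediction."""
    start: int   # first indexed position in the region (inclusive)
    end: int     # one past the last indexed position (exclusive)
    mu: float    # predicted expected event count over the whole region
    var: float   # variance (estimation uncertainty) of that expectation


def scale_to_positions(pred: RegionPrediction,
                       positions: Iterable[int]) -> Tuple[float, float]:
    """Scale a region-level prediction to the queried positions inside it.

    Assumes a constant rate across the region, so the expected count
    scales with the fraction f of positions queried and the variance
    scales with f squared.
    """
    length = pred.end - pred.start
    covered = {p for p in positions if pred.start <= p < pred.end}
    frac = len(covered) / length
    return frac * pred.mu, frac ** 2 * pred.var


def query(preds: List[RegionPrediction],
          positions: Iterable[int]) -> Tuple[float, float, float]:
    """Answer a query over an arbitrary set of indexed positions.

    Returns (expected count, variance of the rate estimate, variance of
    the event count). No retraining occurs: existing predictions are
    only rescaled and combined. Regions are assumed independent.
    """
    positions = list(positions)
    mu_total = 0.0
    var_total = 0.0
    for pred in preds:
        mu, var = scale_to_positions(pred, positions)
        mu_total += mu
        var_total += var
    # Law of total variance for a Poisson count with an uncertain rate:
    # Var[count] = E[rate] + Var[rate].
    count_var = mu_total + var_total
    return mu_total, var_total, count_var


if __name__ == "__main__":
    # Illustrative values only; they are not taken from the patent.
    regions = [
        RegionPrediction(start=0, end=10, mu=5.0, var=4.0),
        RegionPrediction(start=10, end=20, mu=2.0, var=1.0),
    ]
    print(query(regions, [0, 1, 2, 3, 4, 10, 11]))
```

An anomaly check for the target application could then compare an observed event count at the queried positions against this expected count and count variance. The choice of test statistic is left open here because the claim does not specify one.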