| CPC G16B 40/00 (2019.02) [G06N 3/04 (2013.01); G06N 3/088 (2013.01); G06N 7/01 (2023.01)] | 18 Claims |

|
1. A method comprising:
obtaining a neural network that has been trained to determine a likelihood that read pileup windows provided as input are representative of variants, wherein the neural network is produced by:
obtaining a plurality of read pileup windows associated with a first sample genome,
wherein each read pileup window of the plurality of read pileup windows is associated with a different reference nucleotide position within the first sample genome,
wherein each read pileup window of the plurality of read pileup windows includes sequence reads generated using a particular read process, and
wherein a given read pileup window of the plurality of read pileup windows includes a plurality of sequence reads that each include a nucleotide aligned at a given reference nucleotide position, within the first sample genome, that is associated with the given read pileup window;
obtaining, for each reference nucleotide position that is associated with a read pileup window within the plurality of read pileup windows, a label that indicates whether the reference nucleotide position is either (i) a known variant or (ii) a non-variant; and
training the neural network, based on data indicative of the plurality of read pileup windows and the labels;
receiving, as input, a read pileup window that is associated with a second sample genome and that includes sequence reads generated using the particular read process; and
applying the neural network to the read pileup window to produce an output that is representative of a likelihood that the read pileup window associated with the second sample genome is representative of a variant.
|
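The train-then-apply flow recited in claim 1 can be illustrated with a minimal sketch. This is not the patented implementation. It assumes a read pileup window can be simplified to a reference string plus equal-length aligned reads, with the window's reference position at the center column. It also assumes two hand-made features (mismatch fraction at the center and in the flanks) and a tiny one-hidden-layer network written in pure Python. The names `simulate_window`, `encode_window`, `TinyNet`, and `train` are hypothetical, and all data is synthetic. A single simulator stands in for "the particular read process" shared by the first and second sample genomes.

```python
import math
import random

BASES = "ACGT"

def simulate_window(rng, is_variant, width=5, depth=20, error_rate=0.02):
    """Synthetic pileup window (assumption: one simulator = one read process)."""
    ref = [rng.choice(BASES) for _ in range(width)]
    center = width // 2
    alt = rng.choice([b for b in BASES if b != ref[center]])
    reads = []
    for _ in range(depth):
        read = list(ref)
        if is_variant and rng.random() < 0.5:  # heterozygous-like variant
            read[center] = alt
        for i in range(width):                 # random sequencing errors
            if rng.random() < error_rate:
                read[i] = rng.choice(BASES)
        reads.append("".join(read))
    return {"ref": "".join(ref), "reads": reads}

def encode_window(window):
    """Features: mismatch fraction at the center column and mean over the flanks."""
    ref, reads = window["ref"], window["reads"]
    center = len(ref) // 2
    mismatch = [sum(r[i] != ref[i] for r in reads) / len(reads)
                for i in range(len(ref))]
    flank = [m for i, m in enumerate(mismatch) if i != center]
    return [mismatch[center], sum(flank) / len(flank)]

class TinyNet:
    """One tanh hidden layer and a sigmoid output, trained by plain SGD."""
    def __init__(self, n_in, n_hidden, rng):
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
                   for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
        self.b2 = 0.0

    def forward(self, x):
        h = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
             for row, b in zip(self.w1, self.b1)]
        z = sum(w * hi for w, hi in zip(self.w2, h)) + self.b2
        return h, 1.0 / (1.0 + math.exp(-z))

    def predict(self, x):
        return self.forward(x)[1]

    def train_step(self, x, y, lr):
        h, p = self.forward(x)
        dz = p - y  # gradient of binary cross-entropy w.r.t. the output logit
        for j, hj in enumerate(h):
            dh = dz * self.w2[j] * (1.0 - hj * hj)  # uses w2 before its update
            self.w2[j] -= lr * dz * hj
            for i, xi in enumerate(x):
                self.w1[j][i] -= lr * dh * xi
            self.b1[j] -= lr * dh
        self.b2 -= lr * dz

def train(windows, labels, epochs=100, lr=0.2, seed=0):
    """Fit the network on (window, label) pairs; label 1 = known variant."""
    rng = random.Random(seed)
    net = TinyNet(2, 4, rng)
    data = [(encode_window(w), float(y)) for w, y in zip(windows, labels)]
    for _ in range(epochs):
        rng.shuffle(data)
        for x, y in data:
            net.train_step(x, y, lr)
    return net

# "First sample genome": labeled windows used for training.
rng = random.Random(42)
labels = [i % 2 for i in range(200)]
train_windows = [simulate_window(rng, bool(y)) for y in labels]
net = train(train_windows, labels)

# "Second sample genome": unlabeled windows from the same simulated read process.
p_variant = net.predict(encode_window(simulate_window(rng, True)))
p_reference = net.predict(encode_window(simulate_window(rng, False)))
print(f"variant-like window: {p_variant:.3f}, reference-like window: {p_reference:.3f}")
```

The sigmoid output plays the role of the claimed "likelihood that the read pileup window ... is representative of a variant". A practical system would likely encode the full window (bases, qualities, strand) rather than two summary features, but that choice is outside what the claim text specifies.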