CPC G16B 40/20 (2019.02) [G06F 9/3877 (2013.01); G06F 18/2148 (2023.01); G06F 18/2431 (2023.01); G06N 3/04 (2013.01); G06N 3/045 (2023.01); G06N 3/084 (2013.01); G16B 20/00 (2019.02); G16B 20/20 (2019.02); G16B 40/00 (2019.02)] | 20 Claims |
1. A system comprising:
at least one processor; and
a non-transitory computer readable medium storing a convolutional neural network and instructions that, when executed by the at least one processor, cause the system to:
identify a group of reads aligned with a reference genome and spanning a candidate variant at a target base position;
provide, to the convolutional neural network, an array of input features generated from a text file comprising sequencing data output by a sequencer instrument, the array of input features encoding:
bases from the group of reads in the text file at the target base position,
bases flanking each side of the target base position in the text file, and
corresponding base features for bases within the group of reads; and
generate, based on an analysis of the array of input features by the convolutional neural network, classification scores indicating likelihoods that the candidate variant at the target base position is a variant.
|
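As an illustration of the array of input features recited in claim 1, the following is a minimal sketch, not the claimed implementation. It selects the reads that span a target base position and encodes, for each read, the base at that position, the bases flanking it on each side, and one per-base feature. The `Read` type, the function names, the five-channel layout (one-hot A, C, G, T plus a scaled quality score), the choice of Phred quality as the "corresponding base feature", the divisor of 60, and the absence of insertions and deletions are all assumptions made for this example; the claim does not specify any of them. The convolutional neural network itself is not sketched.

```python
from typing import List, NamedTuple

BASES = "ACGT"  # channel order for the one-hot encoding (assumed)


class Read(NamedTuple):
    """Hypothetical aligned read; indels are ignored in this sketch."""
    start: int        # 0-based reference position of the first aligned base
    bases: str        # aligned base calls
    quals: List[int]  # per-base Phred quality (an assumed base feature)


def reads_spanning(reads: List[Read], target: int) -> List[Read]:
    """Keep only the reads whose alignment covers the target position."""
    return [r for r in reads if r.start <= target < r.start + len(r.bases)]


def encode(reads: List[Read], target: int, flank: int) -> List[List[List[float]]]:
    """Build an array indexed as [read][window column][channel].

    The window covers `flank` bases on each side of the target position.
    Columns a read does not cover are filled with zeros.
    """
    array = []
    for r in reads_spanning(reads, target):
        row = []
        for pos in range(target - flank, target + flank + 1):
            i = pos - r.start
            if 0 <= i < len(r.bases):
                one_hot = [1.0 if r.bases[i] == b else 0.0 for b in BASES]
                row.append(one_hot + [r.quals[i] / 60.0])  # assumed scaling
            else:
                row.append([0.0] * (len(BASES) + 1))
        array.append(row)
    return array


# Three toy reads; only the first two span reference position 3.
reads = [
    Read(0, "ACGTA", [30] * 5),
    Read(2, "GTTCA", [60] * 5),
    Read(10, "AAAA", [20] * 4),
]
features = encode(reads, target=3, flank=1)
```

Here `features` has one row per spanning read and one column per window position, with the middle column holding the base at the target position. An array of this shape is the kind of input a convolutional neural network could analyze to produce classification scores for the candidate variant.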