US 12,444,482 B2
Multi-channel protein voxelization to predict variant pathogenicity using deep convolutional neural networks
Tobias Hamp, Cambridge (GB); Kai-How Farh, Hillsborough, CA (US); and Hong Gao, Palo Alto, CA (US)
Assigned to Illumina, Inc., San Diego, CA (US)
Filed by Illumina, Inc., San Diego, CA (US)
Filed on Mar. 24, 2022, as Appl. No. 17/703,935.
Claims priority of provisional application 63/175,767, filed on Apr. 16, 2021.
Claims priority of provisional application 63/175,495, filed on Apr. 15, 2021.
Prior Publication US 2022/0336056 A1, Oct. 20, 2022
This patent is subject to a terminal disclaimer.
Int. Cl. G16B 40/20 (2019.01); G06N 3/044 (2023.01); G06N 3/045 (2023.01); G16B 15/20 (2019.01); G16B 20/20 (2019.01); G16B 30/00 (2019.01)
CPC G16B 40/20 (2019.02) [G06N 3/044 (2023.01); G06N 3/045 (2023.01); G16B 15/20 (2019.02); G16B 20/20 (2019.02); G16B 30/00 (2019.02)]
20 Claims
 
1. A system including memory and one or more processors operable to execute instructions, stored in the memory, to perform operations comprising:
utilizing a voxelizer to access a three-dimensional structure of a reference amino acid sequence of a protein, and to fit a three-dimensional grid of voxels on atoms in the three-dimensional structure on an amino acid-basis to generate amino acid-wise distance channels,
wherein each of the amino acid-wise distance channels has a three-dimensional distance value for each voxel in the three-dimensional grid of voxels, and
wherein the three-dimensional distance value specifies a distance from a corresponding voxel in the three-dimensional grid of voxels to atoms of a corresponding reference amino acid in the reference amino acid sequence;
utilizing an alternative allele encoder to encode an alternative allele amino acid to each voxel in the three-dimensional grid of voxels,
wherein the alternative allele amino acid is a three-dimensional representation of a one-hot encoding of a variant amino acid expressed by a variant nucleotide;
utilizing an evolutionary conservation encoder to encode an evolutionary conservation sequence to each voxel in the three-dimensional grid of voxels,
wherein the evolutionary conservation sequence is a three-dimensional representation of amino acid-specific conservation frequencies across a plurality of species, and
wherein the amino acid-specific conservation frequencies are selected in dependence upon amino acid proximity to the corresponding voxel; and
utilizing a tensor generator to generate a tensor that includes the amino acid-wise distance channels encoded with the alternative allele amino acid and respective evolutionary conservation sequences.
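
The four operations recited in claim 1 can be illustrated with a small NumPy sketch. It is not the patented implementation, only one possible reading under stated assumptions: each distance channel holds the distance from a voxel center to the nearest atom of one amino acid type, capped at a hypothetical `max_dist`; the conservation frequencies for a voxel are taken from the residue whose atom lies closest to that voxel; and the grid size, voxel size, 20-letter alphabet, and the names `voxel_centers` and `build_tensor` are all illustrative choices that do not appear in the claim.

```python
import numpy as np

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # assumed 20-letter alphabet
AA_INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}


def voxel_centers(center, grid_size=3, voxel_size=2.0):
    """Return (G, G, G, 3) coordinates of voxel centers of a cubic grid."""
    offsets = (np.arange(grid_size) - (grid_size - 1) / 2.0) * voxel_size
    gx, gy, gz = np.meshgrid(offsets, offsets, offsets, indexing="ij")
    return np.stack([gx, gy, gz], axis=-1) + np.asarray(center, dtype=float)


def build_tensor(sequence, atom_coords, atom_res_pos, conservation, alt_aa,
                 center, grid_size=3, voxel_size=2.0, max_dist=10.0):
    """Hypothetical helper: stack distance, alternative-allele and
    conservation channels into one (60, G, G, G) array."""
    atom_coords = np.asarray(atom_coords, dtype=float)   # (N, 3)
    atom_res_pos = np.asarray(atom_res_pos)              # (N,) residue index per atom
    centers = voxel_centers(center, grid_size, voxel_size)
    shape = centers.shape[:3]

    # Distance from every voxel center to every atom: (G, G, G, N).
    dists = np.linalg.norm(centers[..., None, :] - atom_coords, axis=-1)

    # Voxelizer: one distance channel per amino acid type (nearest-atom
    # distance, capped at max_dist; the cap is an assumption).
    atom_aa = np.array(list(sequence))[atom_res_pos]
    distance = np.full((len(AMINO_ACIDS), *shape), max_dist)
    for aa, k in AA_INDEX.items():
        mask = atom_aa == aa
        if mask.any():
            distance[k] = np.minimum(dists[..., mask].min(axis=-1), max_dist)

    # Alternative allele encoder: the same one-hot vector at every voxel.
    alt = np.zeros((len(AMINO_ACIDS), *shape))
    alt[AA_INDEX[alt_aa]] = 1.0

    # Evolutionary conservation encoder: per voxel, use the frequency row
    # of the residue owning the nearest atom (proximity-based selection).
    nearest_pos = atom_res_pos[dists.argmin(axis=-1)]             # (G, G, G)
    cons = np.moveaxis(np.asarray(conservation)[nearest_pos], -1, 0)

    # Tensor generator: concatenate along the channel axis.
    return np.concatenate([distance, alt, cons], axis=0)


# Toy input: a two-residue "protein" (alanine, glycine), one atom each.
sequence = "AG"
atom_coords = [[0.0, 0.0, 0.0], [3.0, 0.0, 0.0]]
atom_res_pos = [0, 1]
conservation = np.array([np.eye(20)[0], np.full(20, 0.05)])  # (L, 20) frequencies
tensor = build_tensor(sequence, atom_coords, atom_res_pos, conservation,
                      alt_aa="V", center=[0.0, 0.0, 0.0])
print(tensor.shape)  # (60, 3, 3, 3)
```

In this sketch the channel axis is laid out as 20 distance channels, then 20 alternative-allele channels, then 20 conservation channels. The claim says only that the distance channels are encoded with the other two inputs, so the ordering and the use of concatenation are further assumptions.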