| CPC G16B 40/20 (2019.02) [G06N 3/044 (2023.01); G06N 3/045 (2023.01); G16B 15/20 (2019.02); G16B 20/20 (2019.02); G16B 30/00 (2019.02)] | 20 Claims |

|
1. A system including memory and one or more processors operable to execute instructions, stored in the memory, to perform operations comprising:
utilizing a voxelizer to access a three-dimensional structure of a reference amino acid sequence of a protein, and to fit a three-dimensional grid of voxels on atoms in the three-dimensional structure on an amino acid-basis to generate amino acid-wise distance channels,
wherein each of the amino acid-wise distance channels has a three-dimensional distance value for each voxel in the three-dimensional grid of voxels, and
wherein the three-dimensional distance value specifies a distance from a corresponding voxel in the three-dimensional grid of voxels to atoms of a corresponding reference amino acid in the reference amino acid sequence;
utilizing an alternative allele encoder to encode an alternative allele amino acid to each voxel in the three-dimensional grid of voxels,
wherein the alternative allele amino acid is a three-dimensional representation of a one-hot encoding of a variant amino acid expressed by a variant nucleotide;
utilizing an evolutionary conservation encoder to encode an evolutionary conservation sequence to each voxel in the three-dimensional grid of voxels,
wherein the evolutionary conservation sequence is a three-dimensional representation of amino acid-specific conservation frequencies across a plurality of species, and
wherein the amino acid-specific conservation frequencies are selected in dependence upon amino acid proximity to the corresponding voxel; and
utilizing a tensor generator to generate a tensor that includes the amino acid-wise distance channels encoded with the alternative allele amino acid and respective evolutionary conservation sequences.
|
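The four operations recited in claim 1 (voxelizing, alternative allele encoding, evolutionary conservation encoding, and tensor generation) can be illustrated with a minimal pure-Python sketch. Everything named here is hypothetical and not taken from the claim: the function names `voxel_centers` and `build_tensor`, the 20-letter `AMINO_ACIDS` alphabet, the grid size and spacing, the `max_dist` cap, and the toy coordinates. The sketch also assumes one distance channel per amino acid type holding the nearest-atom distance, and assumes that "selected in dependence upon amino acid proximity" means taking the conservation frequencies of the residue nearest to each voxel; the claim itself does not fix either choice.

```python
import math

# Assumed 20-letter amino acid alphabet; the claim does not specify one.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def voxel_centers(center, size, spacing):
    """Yield ((i, j, k), (x, y, z)) for a size^3 grid centered on `center`."""
    offset = (size - 1) / 2.0
    for i in range(size):
        for j in range(size):
            for k in range(size):
                xyz = tuple(
                    c + (idx - offset) * spacing
                    for c, idx in zip(center, (i, j, k))
                )
                yield (i, j, k), xyz


def build_tensor(atoms, conservation, alt_aa, center,
                 size=3, spacing=2.0, max_dist=10.0):
    """Return {voxel index: 60 channel values} for one variant.

    atoms:        list of (residue_index, amino_acid_letter, (x, y, z))
    conservation: {residue_index: 20 per-amino-acid frequencies}
    alt_aa:       one-letter code of the variant (alternative allele) amino acid
    """
    # Alternative allele: the same one-hot vector is copied to every voxel.
    one_hot = [1.0 if aa == alt_aa else 0.0 for aa in AMINO_ACIDS]
    tensor = {}
    for idx, xyz in voxel_centers(center, size, spacing):
        # Distance channels: nearest atom of each amino acid type, capped.
        dist = [max_dist] * len(AMINO_ACIDS)
        nearest_res, nearest_d = None, math.inf
        for res_idx, aa, pos in atoms:
            d = math.dist(xyz, pos)
            channel = AMINO_ACIDS.index(aa)
            dist[channel] = min(dist[channel], d)
            if d < nearest_d:
                nearest_res, nearest_d = res_idx, d
        # Conservation: frequencies of the residue closest to this voxel.
        tensor[idx] = dist + one_hot + list(conservation[nearest_res])
    return tensor


# Toy input: one alanine atom at the origin, one glycine atom 3 units away.
atoms = [(0, "A", (0.0, 0.0, 0.0)), (1, "G", (3.0, 0.0, 0.0))]
conservation = {
    0: [1.0 if aa == "A" else 0.0 for aa in AMINO_ACIDS],
    1: [1.0 if aa == "G" else 0.0 for aa in AMINO_ACIDS],
}
tensor = build_tensor(atoms, conservation, alt_aa="V", center=(0.0, 0.0, 0.0))
```

Each voxel in this sketch carries 20 distance values, 20 one-hot values, and 20 conservation values, concatenated in that order. A practical implementation would more likely use a dense array library and a larger grid; the dictionary is used here only to keep the example dependency-free.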