CPC G06F 16/906 (2019.01) [G06F 16/9014 (2019.01); G06F 16/90344 (2019.01); G16B 30/00 (2019.02); G16B 40/00 (2019.02); G16B 50/30 (2019.02)] | 20 Claims |
1. A system, comprising:
one or more processors coupled to memory, the one or more processors configured to:
access a hash table that stores a plurality of first k-mers of a human genome, each first k-mer of the plurality of first k-mers corresponding to a first number of characters (k);
generate a plurality of second k-mers of a read of a cluster of a plurality of clusters, the plurality of clusters obtained from a sample;
determine a second number of the plurality of second k-mers that match at least one of the plurality of first k-mers in the hash table;
generate a subset of the plurality of clusters by removing the cluster from the plurality of clusters responsive to the second number satisfying a threshold; and
transmit the subset of the plurality of clusters to a computing system.
|
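The filtering steps recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation, and it rests on several assumptions that the claim leaves open: a Python `set` stands in for the hash table, the first read of each cluster is taken as the read being tested, "satisfying a threshold" is read as "greater than or equal to", and the transmit step is omitted. All function names (`kmers`, `build_reference_table`, `count_matches`, `filter_clusters`) are hypothetical.

```python
from typing import Dict, List, Set


def kmers(seq: str, k: int) -> List[str]:
    """Return every overlapping substring of length k in seq."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def build_reference_table(reference: str, k: int) -> Set[str]:
    """Store the reference k-mers (the "first k-mers") in a hash-based set."""
    return set(kmers(reference, k))


def count_matches(read: str, table: Set[str], k: int) -> int:
    """Count the read's k-mers (the "second k-mers") found in the table."""
    return sum(1 for km in kmers(read, k) if km in table)


def filter_clusters(
    clusters: Dict[str, List[str]],
    table: Set[str],
    k: int,
    threshold: int,
) -> Dict[str, List[str]]:
    """Drop each cluster whose tested read reaches the match threshold."""
    kept: Dict[str, List[str]] = {}
    for cluster_id, reads in clusters.items():
        # Assumption: the first read represents the cluster; the claim
        # only says "a read of a cluster".
        if reads and count_matches(reads[0], table, k) >= threshold:
            continue  # cluster looks like reference (human) sequence
        kept[cluster_id] = reads
    return kept


# Toy data: a short made-up "reference" and two single-read clusters.
K = 4
table = build_reference_table("ACGTACGTGGCCTTAA", K)
clusters = {"c1": ["ACGTACGT"], "c2": ["TTTTTTTT"]}
subset = filter_clusters(clusters, table, K, threshold=3)
print(subset)
```

With this toy data, all five 4-mers of the read in `c1` occur in the reference, so `c1` is removed, while `c2` shares none and is retained. A real human genome would call for a far more compact k-mer encoding than Python strings.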