US 11,676,685 B2
Artificial intelligence-based quality scoring
Kishore Jaganathan, San Francisco, CA (US); John Randall Gobbel, Brisbane, CA (US); and Amirali Kia, San Mateo, CA (US)
Assigned to Illumina, Inc., San Diego, CA (US)
Filed by Illumina, Inc., San Diego, CA (US)
Filed on Mar. 20, 2020, as Appl. No. 16/826,134.
Claims priority of provisional application 62/821,602, filed on Mar. 21, 2019.
Claims priority of provisional application 62/821,618, filed on Mar. 21, 2019.
Claims priority of provisional application 62/821,681, filed on Mar. 21, 2019.
Claims priority of provisional application 62/821,724, filed on Mar. 21, 2019.
Claims priority of provisional application 62/821,766, filed on Mar. 21, 2019.
Claims priority of application No. 2023310 (NL), filed on Jun. 14, 2019; application No. 2023311 (NL), filed on Jun. 14, 2019; application No. 2023312 (NL), filed on Jun. 14, 2019; and application No. 2023314 (NL), filed on Jun. 14, 2019.
Prior Publication US 2020/0327377 A1, Oct. 15, 2020
Int. Cl. G06K 9/00 (2022.01); G16B 40/20 (2019.01); G06N 3/08 (2023.01); G16B 40/00 (2019.01); G06N 3/04 (2023.01); G06F 16/907 (2019.01); G06N 3/084 (2023.01); G06V 10/82 (2022.01); G06F 18/23 (2023.01); G06F 18/24 (2023.01); G06F 18/213 (2023.01); G06F 18/214 (2023.01); G06F 18/21 (2023.01); G06F 18/2415 (2023.01); G06F 18/2431 (2023.01); G06F 18/23211 (2023.01); G06N 7/01 (2023.01); G06V 10/762 (2022.01); G06V 10/764 (2022.01); G06V 10/77 (2022.01); G06V 10/778 (2022.01); G06V 10/44 (2022.01); G06V 10/98 (2022.01); G06N 5/046 (2023.01)

CPC G16B 40/20 (2019.02) [G06F 16/907 (2019.01); G06F 18/213 (2023.01); G06F 18/214 (2023.01); G06F 18/217 (2023.01); G06F 18/23 (2023.01); G06F 18/23211 (2023.01); G06F 18/24 (2023.01); G06F 18/2415 (2023.01); G06F 18/2431 (2023.01); G06N 3/04 (2013.01); G06N 3/08 (2013.01); G06N 3/084 (2013.01); G06N 7/01 (2023.01); G06V 10/454 (2022.01); G06V 10/763 (2022.01); G06V 10/764 (2022.01); G06V 10/7715 (2022.01); G06V 10/7784 (2022.01); G06V 10/82 (2022.01); G06V 10/993 (2022.01); G16B 40/00 (2019.02); G06N 5/046 (2013.01)]

20 Claims

1. A computer-implemented method of quality scoring for a set of base calls, including:

processing input data for one or more analytes through a neural network-based base caller and producing an alternative representation of the input data;

processing the alternative representation through an output layer to produce an output, wherein the output identifies likelihoods of a base incorporated in a particular one of the analytes being A, C, T, and G;

calling bases for one or more of the analytes based on the output; and

determining quality scores of the called bases based on the likelihoods identified by the output by:

quantizing classification scores of base calls produced by the neural network-based base caller in response to processing training data during training;

selecting a set of quantized classification scores;

for each quantized classification score in the set of quantized classification scores, determining a base calling error rate by comparing predicted base calls for the set of quantized classification scores to corresponding ground truth base calls;

determining a fit between the set of quantized classification scores and corresponding base calling error rates; and

correlating the quality scores to the set of quantized classification scores based on the fit.
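
The steps recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation, and the claim leaves several details open, so the following are assumptions made only for illustration: the output layer is a softmax over the four bases, the classification score of a call is its highest softmax likelihood, quantization uses equal-width bins, the fit is a straight line in log-log space, and quality scores follow the Phred convention Q = -10 log10(error rate). All function names and parameters (`softmax`, `call_bases`, `quantize`, `error_rates`, `fit_quality_table`, `n_bins`, `min_count`) are hypothetical.

```python
import numpy as np

BASES = np.array(list("ACTG"))  # assumed column order of the output layer


def softmax(logits):
    """Output layer: turn per-base scores into likelihoods that sum to 1."""
    shifted = logits - logits.max(axis=-1, keepdims=True)  # numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)


def call_bases(likelihoods):
    """Call the most likely base; its likelihood is the classification score."""
    return likelihoods.argmax(axis=-1), likelihoods.max(axis=-1)


def quantize(scores, n_bins=10):
    """Map each score in [0.25, 1] to the centre of an equal-width bin."""
    edges = np.linspace(0.25, 1.0, n_bins + 1)
    centers = (edges[:-1] + edges[1:]) / 2
    bin_idx = np.clip(np.digitize(scores, edges) - 1, 0, n_bins - 1)
    return centers[bin_idx]


def error_rates(quantized, predicted, truth, min_count=50):
    """Empirical base calling error rate per quantized score (sparse bins skipped)."""
    levels, rates = [], []
    for level in np.unique(quantized):
        mask = quantized == level
        count = int(mask.sum())
        if count < min_count:
            continue  # this is the "selecting a set" step in this sketch
        errors = int(np.sum(predicted[mask] != truth[mask]))
        rates.append((errors + 1) / (count + 2))  # smoothing avoids log10(0)
        levels.append(float(level))
    return np.array(levels), np.array(rates)


def fit_quality_table(levels, rates):
    """Fit log10(error) against log10(1 - score); map each score to a Phred-style Q."""
    x = np.log10(1.0 - levels)
    slope, intercept = np.polyfit(x, np.log10(rates), 1)
    fitted_log_error = slope * x + intercept
    return dict(zip(levels.tolist(), (-10.0 * fitted_log_error).tolist()))


if __name__ == "__main__":
    # Synthetic stand-in for training data: random logits, with ground truth
    # sampled from the model's own likelihoods (a perfectly calibrated model).
    rng = np.random.default_rng(0)
    likelihoods = softmax(rng.normal(scale=3.0, size=(20000, 4)))
    predicted, scores = call_bases(likelihoods)
    draws = rng.random((len(scores), 1))
    truth = np.minimum((draws > likelihoods.cumsum(axis=1)).sum(axis=1), 3)

    levels, rates = error_rates(quantize(scores), predicted, truth)
    for score, q in fit_quality_table(levels, rates).items():
        print(f"quantized score {score:.4f} -> Q{q:.1f}")
    print("first calls:", "".join(BASES[predicted[:10]]))
```

At inference time, under the same assumptions, a new base call would be quantized with the same bin edges and its quality score read from the fitted table, so no ground truth is needed once the table has been built from training data.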