US 12,266,096 B2
Systems and methods for processing images of slides to infer biomarkers
Supriya Kapur, New York, NY (US); Ran Godrich, New York, NY (US); Christopher Kanan, Rochester, NY (US); Thomas Fuchs, New York, NY (US); and Leo Grady, Darien, CT (US)
Assigned to Paige.AI, Inc., New York, NY (US)
Filed by PAIGE.AI, Inc., New York, NY (US)
Filed on Sep. 9, 2020, as Appl. No. 17/016,048.
Claims priority of provisional application 62/897,734, filed on Sep. 9, 2019.
Prior Publication US 2021/0073986 A1, Mar. 11, 2021
Int. Cl. G06T 7/00 (2017.01); G06F 18/214 (2023.01); G06T 7/11 (2017.01); G06V 20/69 (2022.01); G16H 10/40 (2018.01); G16H 30/40 (2018.01); G16H 50/20 (2018.01)
CPC G06T 7/0012 (2013.01) [G06F 18/214 (2023.01); G06T 7/11 (2017.01); G06V 20/695 (2022.01); G06V 20/698 (2022.01); G16H 10/40 (2018.01); G16H 30/40 (2018.01); G16H 50/20 (2018.01); G06T 2207/10056 (2013.01); G06T 2207/20081 (2013.01); G06T 2207/30024 (2013.01); G06V 2201/03 (2022.01)]
20 Claims
1. A computer-implemented method for analyzing an image corresponding to a specimen, the method comprising:
receiving a target electronic image corresponding to a target specimen, the target specimen comprising a tissue sample of a patient;
applying a first machine learning algorithm to the target electronic image to identify one or more salient regions of the target electronic image;
applying a second machine learning algorithm focusing on the one or more salient regions of the target electronic image to identify a region of interest of the target specimen and determine a numeric expression level of, category of, and a numeric probability that there is a presence of a biomarker in the region of interest of the target specimen, the biomarker comprising at least one from among an epithelial growth factor receptor (EGFR) biomarker and/or a DNA mismatch repair (MMR) deficiency biomarker, the second machine learning algorithm having been generated by processing a plurality of training images to predict whether a region of interest is present in the target electronic image, the training images comprising images of human tissue and/or images that are algorithmically generated; wherein determining the numeric expression level of the biomarker includes:
breaking the one or more salient regions of the target electronic image into a plurality of tiles;
determining, by the second machine learning algorithm, the numeric expression level corresponding to a prediction for each tile to determine a plurality of tile predictions; and
aggregating the plurality of tile predictions into at least one part level numeric expression level prediction for sub-regions of the one or more salient regions, wherein the at least one part level numeric expression level prediction and prediction for each tile are fed into the second machine learning algorithm to determine the numeric expression level prediction; and
outputting the determined numeric expression level of, category of, and presence of the biomarker in the region of interest.
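
The tiling and aggregation steps recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation: the function names, the mean-intensity stand-in for the second machine learning algorithm, the row-band definition of sub-regions, and the equal-weight combination of tile-level and part-level predictions are all assumptions made for illustration only.

```python
from statistics import mean

def split_into_tiles(region, tile_size):
    """Break a 2D salient region (list of rows) into non-overlapping tiles."""
    tiles = []
    for r in range(0, len(region), tile_size):
        for c in range(0, len(region[0]), tile_size):
            tile = [row[c:c + tile_size] for row in region[r:r + tile_size]]
            tiles.append(((r, c), tile))  # keep the tile origin for grouping
    return tiles

def predict_tile(tile):
    """Hypothetical stand-in for the second model: mean pixel intensity."""
    return mean([p for row in tile for p in row])

def aggregate_parts(tile_preds, part_height):
    """Average tile predictions within each sub-region (assumed: row bands)."""
    parts = {}
    for (r, _c), pred in tile_preds:
        parts.setdefault(r // part_height, []).append(pred)
    return {key: mean(vals) for key, vals in parts.items()}

def final_expression_level(tile_preds, part_preds):
    """Hypothetical combiner: equal-weight blend of tile and part predictions."""
    tile_mean = mean([pred for _origin, pred in tile_preds])
    part_mean = mean(list(part_preds.values()))
    return 0.5 * tile_mean + 0.5 * part_mean

# Toy 4x4 "salient region" with made-up intensities.
region = [
    [0.0, 0.0, 1.0, 1.0],
    [0.0, 0.0, 1.0, 1.0],
    [0.2, 0.2, 0.2, 0.2],
    [0.2, 0.2, 0.2, 0.2],
]
tiles = split_into_tiles(region, tile_size=2)
tile_preds = [(origin, predict_tile(tile)) for origin, tile in tiles]
part_preds = aggregate_parts(tile_preds, part_height=2)
level = final_expression_level(tile_preds, part_preds)
print(len(tiles), part_preds, level)
```

In the claim, both the tile predictions and the part-level predictions are fed back into the second machine learning algorithm; the fixed blend above merely marks where that learned combination would sit.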