US 12,079,311 B2
AI-enhanced data labeling
Carlos Andres Esteva, Mountain View, CA (US); and Douwe Stefan van der Wal, Amsterdam (NL)
Assigned to Salesforce, Inc., San Francisco, CA (US)
Filed by Salesforce, Inc., San Francisco, CA (US)
Filed on Jan. 8, 2021, as Appl. No. 17/145,123.
Prior Publication US 2022/0222484 A1, Jul. 14, 2022
Int. Cl. G06T 7/11 (2017.01); G06F 18/214 (2023.01); G06F 18/2413 (2023.01); G06F 18/40 (2023.01); G06N 3/08 (2023.01); G06T 7/00 (2017.01); G06V 10/25 (2022.01); G06V 10/40 (2022.01)
CPC G06F 18/2413 (2023.01) [G06F 18/214 (2023.01); G06F 18/40 (2023.01); G06N 3/08 (2013.01); G06T 7/0012 (2013.01); G06T 7/11 (2017.01); G06V 10/25 (2022.01); G06V 10/40 (2022.01); G06T 2207/20081 (2013.01); G06T 2207/20084 (2013.01); G06T 2207/30024 (2013.01); G06V 2201/03 (2022.01)]
20 Claims
 
1. A method for computer-assisted annotation of image data corresponding to an image comprising one or more biological structures, the method comprising:
segmenting, by a machine-learned segmentation model, the image data into image portions, wherein each of the image portions comprises at least one of the one or more biological structures, by:
selecting, based on a stain type that the at least one of the one or more biological structures in the image is stained with, the machine-learned segmentation model from a plurality of available models,
wherein the machine-learned segmentation model is trained using a supervised learning technique with one or more manually segmented images; and
applying the selected machine-learned segmentation model to the image data to identify the image portions;
receiving user-generated labels for a first subset of the image portions while user-generated labels for a second subset of the image portions are unavailable;
training a machine-learned classifier using the labeled first subset of the image portions;
applying the machine-learned classifier trained using the first subset of the image portions to a second subset of the image portions to generate recommended labels for at least some of the second subset of the image portions,
wherein the second subset of the image portions and the first subset of the image portions belong to the same image;
labeling the second subset of the image portions based on user input accepting the recommended labels without user-generated labels for the second subset of the image portions; and
in response to a threshold number of the second subset of the image portions being labeled, retraining the machine-learned classifier using the labeled second subset of the image portions that are labeled by the recommended labels directly generated by the trained machine-learned classifier and accepted by the user input without user-generated labels for the second subset of the image portions.
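
The steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation: the threshold-based segmenters, the SEGMENTERS registry, the features function, the NearestCentroid classifier, and the annotate function are all hypothetical stand-ins chosen so the example runs with the standard library alone. The sketch also assumes that retraining uses the user-generated labels together with the accepted recommendations, which the claim does not require.

```python
from collections import defaultdict


def threshold_segmenter(cutoff):
    """Toy stand-in for a trained segmentation model (assumption): groups
    consecutive pixel values above `cutoff` into one image portion."""
    def segment(pixels):
        portions, current = [], []
        for value in pixels:
            if value > cutoff:
                current.append(value)
            elif current:
                portions.append(current)
                current = []
        if current:
            portions.append(current)
        return portions
    return segment


# Hypothetical registry: one segmentation model per stain type.
SEGMENTERS = {"H&E": threshold_segmenter(0.5), "DAPI": threshold_segmenter(0.2)}


def select_segmenter(stain_type):
    """Select the segmentation model based on the stain type."""
    return SEGMENTERS[stain_type]


def features(portion):
    """Illustrative features for a portion: (size, mean intensity)."""
    return (len(portion), sum(portion) / len(portion))


class NearestCentroid:
    """Tiny classifier used only for illustration."""

    def fit(self, X, y):
        groups = defaultdict(list)
        for x, label in zip(X, y):
            groups[label].append(x)
        # One centroid per label: the column-wise mean of its feature rows.
        self.centroids = {
            label: tuple(sum(col) / len(rows) for col in zip(*rows))
            for label, rows in groups.items()
        }
        return self

    def predict(self, x):
        def sq_dist(label):
            return sum((a - b) ** 2 for a, b in zip(x, self.centroids[label]))
        return min(self.centroids, key=sq_dist)


def annotate(pixels, stain_type, user_labels, accept, retrain_threshold):
    """user_labels maps portion index -> label (the first subset).
    accept(index, recommended) simulates the user accepting a recommendation.
    Returns the final labels and the number of retraining rounds."""
    feats = [features(p) for p in select_segmenter(stain_type)(pixels)]
    labels = dict(user_labels)
    # Train on the user-labeled first subset only.
    clf = NearestCentroid().fit([feats[i] for i in labels], list(labels.values()))
    newly_accepted, retrains = 0, 0
    for i, feat in enumerate(feats):
        if i in user_labels:
            continue  # already has a user-generated label
        recommended = clf.predict(feat)
        if accept(i, recommended):
            labels[i] = recommended  # labeled without a user-generated label
            newly_accepted += 1
        if newly_accepted >= retrain_threshold:
            # Retrain once enough recommended labels have been accepted.
            idx = list(labels)
            clf = NearestCentroid().fit([feats[j] for j in idx],
                                        [labels[j] for j in idx])
            newly_accepted, retrains = 0, retrains + 1
    return labels, retrains
```

In this sketch, annotate would be called with a flat list of pixel intensities, a stain type key such as "H&E", a dictionary of user-generated labels for a few portions, a callback that stands in for the user accepting or rejecting each recommendation, and the number of accepted recommendations that triggers retraining.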