US 12,260,622 B2
Systems, methods, and apparatuses for the generation of source models for transfer learning to application specific models used in the processing of medical imaging
Zongwei Zhou, Tempe, AZ (US); Vatsal Sodha, San Jose, CA (US); Md Mahfuzur Rahman Siddiquee, Tempe, AZ (US); Ruibin Feng, Scottsdale, AZ (US); Nima Tajbakhsh, Los Angeles, CA (US); and Jianming Liang, Scottsdale, AZ (US)
Assigned to Arizona Board of Regents on behalf of Arizona State University, Scottsdale, AZ (US)
Appl. No. 17/625,313
Filed by Arizona Board of Regents on behalf of Arizona State University, Scottsdale, AZ (US)
PCT Filed Jul. 17, 2020, PCT No. PCT/US2020/042560
§ 371(c)(1), (2) Date Jan. 6, 2022
PCT Pub. No. WO2021/016087, PCT Pub. Date Jan. 28, 2021.
Claims priority of provisional application 62/876,502, filed on Jul. 19, 2019.
Prior Publication US 2022/0262105 A1, Aug. 18, 2022
Int. Cl. G06V 10/82 (2022.01); G06V 10/774 (2022.01); G06V 10/776 (2022.01); G06V 10/98 (2022.01)
CPC G06V 10/7747 (2022.01) [G06V 10/776 (2022.01); G06V 10/82 (2022.01); G06V 10/98 (2022.01); G06V 2201/03 (2022.01)] 21 Claims
OG exemplary drawing
 
1. A system comprising:
a memory to store instructions;
a set of one or more processors to execute the instructions stored in the memory to:
identify a group of unlabeled and unannotated training samples, wherein each unlabeled and unannotated training sample includes a medical image of a selected anatomical region of a body of a patient;
for each unlabeled and unannotated training sample in the group of unlabeled and unannotated training samples:
identify a patch that is a portion of the medical image corresponding to the unlabeled and unannotated training sample;
identify one or more transformations to be applied to the patch; and
generate a transformed patch by applying the one or more transformations to the patch; and
train a source model comprising an encoder-decoder network to learn anatomical patterns from the medical images of the selected anatomical region in a self-supervised manner using a group of transformed patches corresponding to the group of unlabeled and unannotated training samples, and without using labeled or annotated training samples, wherein the encoder-decoder network is trained to generate an approximation of the patch from a corresponding transformed patch, and wherein the encoder-decoder network is trained to minimize a loss function that indicates a difference between the generated approximation of the patch and the patch.
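The patch-and-transformation steps recited in claim 1 can be sketched in NumPy. The claim does not fix which transformations are used, so the two shown here (local pixel shuffling and a monotonic intensity remap) are illustrative assumptions, not the claimed set; `extract_patch`, `shuffle_local_pixels`, and `nonlinear_intensity` are hypothetical names for this sketch.

```python
import numpy as np

def extract_patch(image, top_left, size):
    """Crop a square patch from a 2-D medical image (e.g. a CT slice)."""
    r, c = top_left
    return image[r:r + size, c:c + size].copy()

def shuffle_local_pixels(patch, window=4, rng=None):
    """One candidate transformation: shuffle pixels inside small windows,
    distorting local texture while preserving the global anatomical layout."""
    rng = np.random.default_rng(rng)
    out = patch.copy()
    h, w = out.shape
    for r in range(0, h - window + 1, window):
        for c in range(0, w - window + 1, window):
            block = out[r:r + window, c:c + window].ravel()
            rng.shuffle(block)
            out[r:r + window, c:c + window] = block.reshape(window, window)
    return out

def nonlinear_intensity(patch):
    """Another candidate transformation: a monotonic intensity remap
    (assumes intensities are normalized to [0, 1])."""
    return patch ** 2
```

A restoration model trained to undo such transformations must recover plausible anatomy, which is what forces it to learn anatomical patterns without labels.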
 
9. A non-transitory computer-readable medium containing computer executable instructions that, when executed by a processor, cause the processor to perform a method comprising:
identifying a group of unlabeled and unannotated training samples, wherein each unlabeled and unannotated training sample includes a medical image of a selected anatomical region;
for each unlabeled and unannotated training sample in the group of unlabeled and unannotated training samples:
identifying a patch that is a portion of the medical image corresponding to the unlabeled and unannotated training sample;
identifying one or more transformations to be applied to the patch; and
generating a transformed patch by applying the one or more transformations to the patch; and
training a source model comprising an encoder-decoder network to learn anatomical patterns from the medical images of the selected anatomical region in a self-supervised manner using a group of transformed patches corresponding to the group of unlabeled and unannotated training samples, and without using labeled or annotated training samples, wherein the encoder-decoder network is trained to generate an approximation of the patch from a corresponding transformed patch, and wherein the encoder-decoder network is trained to minimize a loss function that indicates a difference between the generated approximation of the patch and the patch.
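The training objective recited in both claims, generating an approximation of the original patch from its transformed version while minimizing a loss that measures their difference, can be sketched with a toy two-layer linear encoder-decoder and a mean-squared-error loss. This is a minimal stand-in, not the claimed architecture: a real source model would be a convolutional encoder-decoder operating on image patches, and `train_restoration_model` and its parameters are assumptions made for this sketch.

```python
import numpy as np

def train_restoration_model(transformed, originals, hidden=16,
                            lr=0.05, epochs=300, seed=0):
    """Train a tiny linear encoder-decoder to approximate each original
    patch from its transformed counterpart by gradient descent on MSE.

    transformed, originals: arrays of shape (n_samples, d), patches
    flattened to vectors. Returns the learned weights and the loss history.
    """
    rng = np.random.default_rng(seed)
    n, d = transformed.shape
    W_enc = rng.normal(0.0, 0.1, (d, hidden))   # encoder weights
    W_dec = rng.normal(0.0, 0.1, (hidden, d))   # decoder weights
    losses = []
    for _ in range(epochs):
        z = transformed @ W_enc                 # encode transformed patch
        recon = z @ W_dec                       # decoded approximation of patch
        err = recon - originals                 # difference driving the loss
        losses.append(float(np.mean(err ** 2)))
        # backpropagate the MSE loss through both linear layers
        grad_dec = z.T @ err / n
        grad_enc = transformed.T @ (err @ W_dec.T) / n
        W_dec -= lr * grad_dec
        W_enc -= lr * grad_enc
    return (W_enc, W_dec), losses
```

After this self-supervised pretraining, the encoder weights would serve as the source model to be transferred to an application-specific target task.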