US 12,272,168 B2
Systems and methods for processing machine learning language model classification outputs via text block masking
Joel Stremmel, Iowa City, IA (US); Eran Halperin, Santa Monica, CA (US); and Brian Hill, Culver City, CA (US)
Assigned to UNITEDHEALTH GROUP INCORPORATED, Minnetonka, MN (US)
Filed by UnitedHealth Group Incorporated, Minnetonka, MN (US)
Filed on Oct. 14, 2022, as Appl. No. 18/046,831.
Claims priority of provisional application 63/362,902, filed on Apr. 13, 2022.
Prior Publication US 2023/0334887 A1, Oct. 19, 2023
Int. Cl. G06V 30/414 (2022.01); G06F 40/284 (2020.01); G06V 30/19 (2022.01); G06V 30/413 (2022.01)

CPC G06V 30/414 (2022.01) [G06F 40/284 (2020.01); G06V 30/19173 (2022.01); G06V 30/413 (2022.01)]

20 Claims

1. A computer-implemented method for processing document classification system outputs, the computer-implemented method comprising:

generating, using one or more processors, an unmasked label probability score, of one or more unmasked label probability scores, for each of one or more classification labels based at least in part on one or more document data objects;

for each document data object of the one or more document data objects:

segmenting, using the one or more processors, the document data object into a plurality of text blocks;

performing, using the one or more processors and a document classification machine learning model, a classification of the document data object via one or more classification routine iterations, wherein each of the one or more classification routine iterations is configured to:

(i) generate one or more masked text blocks by masking one or more text blocks of the plurality of text blocks,

(ii) generate, using the document classification machine learning model, per-masked document classification of the document data object, based at least in part on the masking of the one or more masked text blocks, and

(iii) generate one or more per-iteration masked label probability scores based at least in part on the one or more masked text blocks absent from the document data object, wherein each of the one or more per-iteration masked label probability scores corresponds to a particular classification label of the one or more classification labels and is associated with one or more of the one or more masked text blocks;

for each masked text block of the one or more masked text blocks:

generating, using the one or more processors, one or more per-label text block importance scores based at least in part on a corresponding one of the one or more unmasked label probability scores and each of the one or more per-iteration masked label probability scores associated with the masked text block;

generating, using the one or more processors, a predictive data output for the document data object based at least in part on the one or more per-label text block importance scores; and

performing, using the one or more processors, one or more prediction-based actions based at least in part on the predictive data output for the one or more document data objects.
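
The steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation. It assumes a fixed-size word segmentation, one masked block per classification iteration, and an importance score equal to the unmasked label probability minus the masked label probability. The names `segment`, `block_importance`, `top_block_per_label`, `toy_classify`, and the `[MASK]` placeholder are all hypothetical, and `toy_classify` is a keyword-counting stand-in for a real document classification model.

```python
from typing import Callable, Dict, List, Tuple

MASK = "[MASK]"  # hypothetical placeholder substituted for a masked text block

Classifier = Callable[[str], Dict[str, float]]


def segment(document: str, block_size: int = 5) -> List[str]:
    """Split a document into blocks of `block_size` words (assumed segmentation rule)."""
    words = document.split()
    return [" ".join(words[i:i + block_size]) for i in range(0, len(words), block_size)]


def block_importance(
    document: str, classify: Classifier, block_size: int = 5
) -> Tuple[List[str], List[Dict[str, float]]]:
    """Return the text blocks and, for each block, a per-label importance score.

    Assumption: importance = unmasked probability - probability with the block masked.
    """
    blocks = segment(document, block_size)
    unmasked = classify(" ".join(blocks))  # unmasked label probability scores
    scores: List[Dict[str, float]] = []
    for i in range(len(blocks)):
        # One classification iteration: mask block i, keep every other block.
        masked_doc = " ".join(MASK if j == i else b for j, b in enumerate(blocks))
        masked = classify(masked_doc)  # per-iteration masked label probability scores
        scores.append({label: unmasked[label] - masked[label] for label in unmasked})
    return blocks, scores


def top_block_per_label(blocks: List[str], scores: List[Dict[str, float]]) -> Dict[str, str]:
    """Pick the highest-importance block for each label (assumes a non-empty document)."""
    return {
        label: blocks[max(range(len(blocks)), key=lambda i: scores[i][label])]
        for label in scores[0]
    }


def toy_classify(text: str) -> Dict[str, float]:
    """Stand-in classifier: the 'cardiology' probability rises with keyword hits."""
    hits = sum(text.lower().count(k) for k in ("cardiac", "heart"))
    p = min(1.0, 0.1 + 0.4 * hits)
    return {"cardiology": p, "other": 1.0 - p}
```

Calling `block_importance(doc, toy_classify)` on a short note yields one score dictionary per block, and `top_block_per_label` then selects, for each label, the block whose removal lowered that label's probability the most. A fuller treatment would mask several blocks per iteration and aggregate the masked scores associated with each block, as the claim language permits.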