US 12,217,001 B2
Natural language processing techniques using multi-context self-attention machine learning frameworks
Mostafa Bayomi, Dublin (IE); Ahmed Selim, Dublin (IE); Kieran O'Donoghue, Dublin (IE); and Michael Bridges, Dublin (IE)
Assigned to Optum Services (Ireland) Limited, Dublin (IE)
Filed by Optum Services (Ireland) Limited, Dublin (IE)
Filed on Apr. 29, 2022, as Appl. No. 17/733,522.
Claims priority of provisional application 63/314,073, filed on Feb. 25, 2022.
Prior Publication US 2023/0306201 A1, Sep. 28, 2023
Int. Cl. G06N 3/0442 (2023.01); G06F 40/284 (2020.01); G06F 40/30 (2020.01); G06N 3/0464 (2023.01); G06N 3/047 (2023.01); G06N 3/09 (2023.01); G06N 3/0985 (2023.01)
CPC G06F 40/284 (2020.01) [G06F 40/30 (2020.01); G06N 3/047 (2023.01)] 20 Claims
OG exemplary drawing
 
1. A computer-implemented method comprising:
generating, by one or more processors and using a multi-context convolutional self-attention machine learning framework, a cross-context token representation based at least in part on an input text token of an input text sequence, wherein:
the multi-context convolutional self-attention machine learning framework comprises a shared token embedding machine learning model, a plurality of context-specific self-attention machine learning models, and a cross-context representation inference machine learning model,
the shared token embedding machine learning model is configured to generate an initial token embedding for the input text token,
a context-specific self-attention machine learning model of the plurality of context-specific self-attention machine learning models is (a) associated with a distinct context window size of a plurality of distinct context window sizes, and (b) configured to generate a context-specific token representation for the input text token based at least in part on the initial token embedding, and
the cross-context representation inference machine learning model is configured to generate the cross-context token representation based at least in part on the context-specific token representation;
generating, by the one or more processors and using a natural language processing machine learning model, a natural language processing output for the input text sequence based at least in part on the cross-context token representation; and
initiating, by the one or more processors, the performance of one or more prediction-based actions based at least in part on the natural language processing output.
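Claim 1 recites a three-part architecture: a shared token embedding model, several context-specific self-attention models (one per distinct context window size), and a cross-context representation inference model whose output feeds a downstream NLP model. The following is a minimal PyTorch sketch of one plausible reading of that architecture, not the patented implementation: the banded attention masks stand in for the per-branch context window sizes, the depthwise convolution gestures at the "convolutional" qualifier in the framework's name, and all module names, hyperparameters, and the mean-pooled classification head are illustrative assumptions rather than details taken from the patent.

```python
import torch
import torch.nn as nn


class ContextSpecificSelfAttention(nn.Module):
    """One branch: a depthwise convolution followed by self-attention masked
    to a fixed local context window (both design choices are assumptions)."""

    def __init__(self, d_model: int, window: int, n_heads: int = 4):
        super().__init__()
        self.window = window
        # Depthwise Conv1d as a stand-in for the "convolutional" qualifier.
        self.conv = nn.Conv1d(d_model, d_model, kernel_size=3,
                              padding=1, groups=d_model)
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:  # (batch, seq, d_model)
        x = self.conv(x.transpose(1, 2)).transpose(1, 2)
        n = x.size(1)
        idx = torch.arange(n, device=x.device)
        # True = attention disallowed: tokens farther apart than `window`.
        mask = (idx[None, :] - idx[:, None]).abs() > self.window
        out, _ = self.attn(x, x, x, attn_mask=mask)
        return out


class MultiContextSelfAttention(nn.Module):
    """Sketch of the claimed pipeline: shared token embedding, one
    self-attention branch per distinct context window size, a cross-context
    fusion layer, and a downstream NLP head."""

    def __init__(self, vocab_size: int, d_model: int = 128,
                 windows: tuple = (2, 4, 8), n_labels: int = 2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)  # shared token embedding model
        self.branches = nn.ModuleList(
            [ContextSpecificSelfAttention(d_model, w) for w in windows]
        )
        # Cross-context representation inference: fuse the per-window outputs.
        self.fuse = nn.Linear(d_model * len(windows), d_model)
        self.head = nn.Linear(d_model, n_labels)        # NLP machine learning model

    def forward(self, token_ids: torch.Tensor) -> torch.Tensor:  # (batch, seq)
        h = self.embed(token_ids)                       # initial token embeddings
        ctx = [branch(h) for branch in self.branches]   # context-specific representations
        cross = self.fuse(torch.cat(ctx, dim=-1))       # cross-context token representations
        return self.head(cross.mean(dim=1))             # sequence-level NLP output
```

Under these assumptions, a forward pass over a tokenized input sequence looks like:

```python
model = MultiContextSelfAttention(vocab_size=30000)
logits = model(torch.randint(0, 30000, (1, 16)))  # one 16-token input sequence
```

The claim's "prediction-based actions" would then be triggered from `logits` by whatever downstream application logic consumes the natural language processing output.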