US 12,461,992 B2
Machine learning models for automated processing of transcription database entries
Akash Dwivedi, Chicago, IL (US); Christopher R. Markson, Hawthorne, NJ (US); and Pritesh J. Shah, Paramus, NJ (US)
Assigned to Evernorth Strategic Development, Inc., St. Louis, MO (US)
Filed by Evernorth Strategic Development, Inc., St. Louis, MO (US)
Filed on Feb. 21, 2024, as Appl. No. 18/583,164.
Application 18/583,164 is a continuation of application No. 17/464,213, filed on Sep. 1, 2021, granted, now 11,947,629.
Prior Publication US 2024/0193231 A1, Jun. 13, 2024
This patent is subject to a terminal disclaimer.
Int. Cl. G06F 18/214 (2023.01); G06F 16/383 (2019.01); G06F 40/295 (2020.01); G10L 15/26 (2006.01)

CPC G06F 18/2148 (2023.01) [G06F 16/383 (2019.01); G06F 40/295 (2020.01); G10L 15/26 (2013.01); G06F 2218/10 (2023.01)]

20 Claims

14. A computerized method for automated processing of transcription database entries, the method comprising:

joining at least a portion of multiple call transcription data entries with at least a portion of multiple agent call log data entries according to timestamps associated with the entries to generate a set of joined call data entries, wherein a transcription database stores the multiple call transcription data entries, wherein a call database stores the multiple agent call log data entries, and wherein the set of joined call data entries includes the call transcription data entries paired with respective ones of the agent call log data entries;

for at least one of the set of joined call data entries: generating an input vector, based on the joined call data entry, for an unsupervised machine learning model; and

for each of at least a portion of the input vectors:

supplying the input vector to the unsupervised machine learning model to assign an output topic classification of the model to the joined call data entry associated with the input vector,

supplying the input vector to at least one sub-topic model associated with the output topic classification to assign one or more sub-topic output classifications to the joined call data entry associated with the input vector,

modifying a user interface of a user device to display the output topic classification;

wherein the transcription database includes multiple word confidence score data entries associated with each call transcription data entry, and

the method further comprising for each of the set of joined call entries:

preprocessing the joined call data entry according to the word confidence score data entries associated with the call transcription data entry to generate preprocessed text, wherein the preprocessing includes removing or replacing at least a portion of transcribed text of the call transcription data entry, and

performing natural language processing vectorization on the preprocessed text to generate the input vector for the unsupervised machine learning model.
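
As an illustration only, the timestamp join, confidence-based preprocessing, and vectorization steps recited in claim 14 might be sketched as follows. The claim does not specify field names, a matching tolerance, a confidence threshold, a placeholder token, or a vectorization scheme; every such name and value below (`timestamp`, `words`, `agent_id`, `tolerance_s`, `min_confidence`, `<unk>`, bag-of-words counts) is a hypothetical assumption and not the patentees' implementation. The unsupervised topic model and sub-topic models are omitted.

```python
from bisect import bisect_left
from collections import Counter


def join_by_timestamp(transcripts, call_logs, tolerance_s=5.0):
    """Pair each transcript with the nearest-in-time call log entry.

    Assumption: entries carry a numeric "timestamp" in seconds, and a pair
    is kept only if the two timestamps differ by at most tolerance_s.
    """
    logs = sorted(call_logs, key=lambda e: e["timestamp"])
    times = [e["timestamp"] for e in logs]
    joined = []
    for t in transcripts:
        i = bisect_left(times, t["timestamp"])
        # Only the neighbours on either side of the insertion point can be nearest.
        candidates = [logs[j] for j in (i - 1, i) if 0 <= j < len(logs)]
        if not candidates:
            continue
        best = min(candidates, key=lambda e: abs(e["timestamp"] - t["timestamp"]))
        if abs(best["timestamp"] - t["timestamp"]) <= tolerance_s:
            joined.append({"transcript": t, "call_log": best})
    return joined


def preprocess(words, min_confidence=0.6, placeholder="<unk>"):
    """Replace words whose confidence score is below the threshold.

    Assumption: words is a list of (word, confidence) pairs.
    """
    return [w.lower() if c >= min_confidence else placeholder for w, c in words]


def vectorize(tokens, vocabulary):
    """Bag-of-words count vector over a fixed vocabulary (one possible vectorization)."""
    counts = Counter(tokens)
    return [counts[term] for term in vocabulary]


# Hypothetical sample data, invented for this sketch.
transcripts = [
    {"timestamp": 100.0,
     "words": [("Refill", 0.95), ("my", 0.9), ("perscripshun", 0.3), ("refill", 0.8)]},
    {"timestamp": 500.0, "words": [("billing", 0.9)]},
]
call_logs = [
    {"timestamp": 102.0, "agent_id": "A1"},
    {"timestamp": 900.0, "agent_id": "A2"},
]

joined = join_by_timestamp(transcripts, call_logs)
tokens = preprocess(joined[0]["transcript"]["words"])
vocabulary = ["refill", "billing", "<unk>"]
input_vector = vectorize(tokens, vocabulary)
```

In this sketch the second transcript has no call log entry within the tolerance and is left unjoined; the resulting `input_vector` is what would then be supplied to a topic model.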