US 12,266,434 B2
System and method for inverse summarization of medical records with data augmentation and knowledge distillation
Zhongkai Fu, Bellevue, WA (US)
Assigned to Microsoft Technology Licensing, LLC, Redmond, WA (US)
Filed by Microsoft Technology Licensing LLC, Redmond, WA (US)
Filed on Sep. 23, 2022, as Appl. No. 17/951,742.
Prior Publication US 2024/0105296 A1, Mar. 28, 2024
Int. Cl. G16H 15/00 (2018.01); G06F 18/214 (2023.01); G06F 40/279 (2020.01)
CPC G16H 15/00 (2018.01) [G06F 18/2148 (2023.01); G06F 40/279 (2020.01)]
17 Claims
 
1. A computer-implemented method, executed on a computing device, comprising:
generating a first synthetic dataset including a synthetic transcription and a corresponding natural dictation record using a first machine learning model trained to generate transcriptions from medical records;
generating a second synthetic dataset including a synthetic medical record and a corresponding natural transcription using a second machine learning model trained to generate medical records from transcriptions;
combining the first synthetic dataset and the second synthetic dataset with a natural dataset into a synthetic training dataset;
training a third machine learning model to generate a medical record from a transcription using the synthetic training dataset by knowledge distillation from the first machine learning model and the second machine learning model to the third machine learning model for generating the synthetic medical record from the transcriptions, wherein the third machine learning model is smaller than the first machine learning model and the second machine learning model; and
generating a medical record from a transcription using the third machine learning model.
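
Illustrative sketch (not part of the claim): the data flow recited in claim 1 can be pictured roughly as follows. Every name here (`Pair`, `build_training_set`, `distillation_loss`, the stand-in model callables) is hypothetical. The claim does not specify a distillation loss, so a temperature-scaled KL divergence between teacher and student output distributions is assumed as one common choice. The sketch also assumes that each synthetic output is paired with the natural input it was generated from.

```python
import math
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass(frozen=True)
class Pair:
    """One (transcription, medical record) training example."""
    transcription: str
    medical_record: str
    origin: str  # "natural", "synthetic_transcription", or "synthetic_record"


def build_training_set(
    natural_pairs: Sequence[Pair],
    records: Sequence[str],
    transcriptions: Sequence[str],
    record_to_transcription: Callable[[str], str],   # stand-in for the first model
    transcription_to_record: Callable[[str], str],   # stand-in for the second model
) -> List[Pair]:
    # First synthetic dataset: synthetic transcription + the natural record it came from.
    first = [Pair(record_to_transcription(r), r, "synthetic_transcription") for r in records]
    # Second synthetic dataset: natural transcription + the synthetic record generated from it.
    second = [Pair(t, transcription_to_record(t), "synthetic_record") for t in transcriptions]
    # Combine both synthetic datasets with the natural dataset.
    return list(natural_pairs) + first + second


def softmax(logits: Sequence[float], temperature: float = 1.0) -> List[float]:
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(
    student_logits: Sequence[float],
    teacher_logits: Sequence[float],
    temperature: float = 2.0,
) -> float:
    # ASSUMPTION: soft-target distillation, KL(teacher || student) at a shared temperature.
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return kl * temperature ** 2


# Toy usage with string-manipulating stand-ins for the two larger models.
natural = [Pair("pt reports cough", "Chief complaint: cough.", "natural")]
training_set = build_training_set(
    natural,
    records=["Assessment: hypertension."],
    transcriptions=["pt has headache"],
    record_to_transcription=lambda r: "dictated: " + r.lower(),
    transcription_to_record=lambda t: "Note: " + t.capitalize() + ".",
)
```

In a real system the smaller third model would be trained on `training_set` with a loss of this kind computed per output token against the larger models' predictions. The toy callables above only demonstrate how the three data sources are combined.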