CPC G06T 7/0012 (2013.01) [G16H 10/60 (2018.01); G16H 30/20 (2018.01); G16H 30/40 (2018.01); G06T 2207/20081 (2013.01); G06T 2207/20084 (2013.01); G06T 2207/30004 (2013.01); G06T 2207/30196 (2013.01)]; 12 Claims
1. A medical device configured to transcribe an appearance of a human being, said device comprising:
a computing device comprising a data processor, and
a computer program product for running on the computing device, wherein the computer program product, when running on said data processor:
receives at least one image from said image capturing sensor;
analyzes said at least one image, the analyzing comprises:
subjecting said at least one image to said first machine learning model;
detecting presence of a living being in said at least one image;
labeling the detected living being in said at least one image using a label;
subjecting at least a part of said at least one image, said part of said at least one image comprising the labeled living being, to said second machine learning model;
retrieving said appearance of said labeled living being from said second machine learning model;
applies said transcription module to transcribe the retrieved appearance of said labeled living being to text, and
outputs said text;
said medical device comprising a common housing holding:
an image capturing sensor;
the computing device comprising a data processor, and
the computer program product;
the computer program product comprising:
a first machine learning model trained for detecting and labeling human beings in at least one image;
a second machine learning model trained for detecting appearances of human beings in at least one image;
a transcription module to transcribe the detected appearances of human beings to text,
wherein said computer program product when running on said data processor causes said computing device to:
retrieve at least one image from said image capturing sensor;
analyze said at least one image, the analyzing comprises:
input said at least one image to said first machine learning model;
said first machine learning model detecting presence of a human being in said at least one image;
said first machine learning model labeling the detected human being in said at least one image using a label;
input at least a part of said at least one image to said second machine learning model, said part of said at least one image comprising the labeled human being, and
said second machine learning model providing said appearance of said labeled human being as an output;
apply said transcription module to transcribe the retrieved appearance of said labeled human being to text and output said text, wherein the transcription to text in said transcription module involves creating a medical record and outputting said text into said medical record;
wherein said second machine learning model comprises:
a first deep neural network which captures the skeleton data of said human being in said at least a part of said at least one image, said first deep neural network using said at least a part of said at least one image as an input and outputting said skeleton data;
a second deep neural network which captures a first appearance of said human being, said second deep neural network using said skeleton data from said first deep neural network as an input and outputting said first appearance as first appearance data;
a third deep neural network which captures a second appearance of said human being in said at least a part of said at least one image, said third deep neural network using said at least a part of said at least one image as an input and outputting said second appearance as second appearance data, and
a fourth deep neural network which captures a third appearance of said human being using said first and second appearance data as an input and outputs third appearance data, said third appearance data comprising a prediction of probabilities of said appearance.
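Purely for illustration, the data flow recited in claim 1 (the first model detects and labels a person, the image part containing the labeled person is passed to the second model, and the resulting appearance is transcribed as text into a medical record) could be sketched as follows. Every name in this sketch (PersonDetector, AppearanceModel, crop, transcribe_to_record, process_frame) is a hypothetical placeholder; the claim does not prescribe this or any particular implementation.

```python
"""Minimal, non-limiting sketch of the pipeline recited in claim 1."""
from dataclasses import dataclass


@dataclass
class Detection:
    label: str    # label assigned to the detected human being
    box: tuple    # (x0, y0, x1, y1) bounding box in pixels


class PersonDetector:
    """Stand-in for the first machine learning model (detect and label humans)."""

    def detect(self, image) -> list[Detection]:
        # A real model would return one Detection per person found in the image.
        return [Detection(label="person_0", box=(0, 0, 64, 128))]


class AppearanceModel:
    """Stand-in for the second machine learning model (appearance of a person)."""

    def describe(self, image_part) -> dict[str, float]:
        # A real model would output probabilities over appearance attributes.
        return {"sitting": 0.92, "wearing_glasses": 0.81}


def crop(image, box):
    """Extract only the part of the image containing the labeled person."""
    x0, y0, x1, y1 = box
    return [row[x0:x1] for row in image[y0:y1]]


def transcribe_to_record(label: str, appearance: dict[str, float], record: list[str]):
    """Stand-in for the transcription module: appearance -> text in a medical record."""
    attrs = ", ".join(k for k, v in appearance.items() if v > 0.5)
    record.append(f"{label}: patient observed {attrs}.")


def process_frame(image, detector: PersonDetector,
                  appearance_model: AppearanceModel, medical_record: list[str]):
    for det in detector.detect(image):                 # first model: detect and label
        part = crop(image, det.box)                    # only the labeled person is passed on
        appearance = appearance_model.describe(part)   # second model: appearance
        transcribe_to_record(det.label, appearance, medical_record)
```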
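The four-network structure of the second machine learning model can likewise be sketched as a minimal PyTorch module, assuming arbitrary layer sizes and a keypoint-style skeleton representation. The module names (SkeletonNet, SkeletonAppearanceNet, ImageAppearanceNet, FusionNet, SecondModel), the layer dimensions, and the sigmoid output are illustrative assumptions, not part of the claim.

```python
"""Illustrative sketch of the four deep neural networks of the second model."""
import torch
import torch.nn as nn


class SkeletonNet(nn.Module):
    """First DNN: image part -> skeleton (pose keypoint) data."""
    def __init__(self, num_keypoints: int = 17):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.head = nn.Linear(32, num_keypoints * 2)   # (x, y) per keypoint

    def forward(self, image_part):
        return self.head(self.backbone(image_part))


class SkeletonAppearanceNet(nn.Module):
    """Second DNN: skeleton data -> first appearance data."""
    def __init__(self, num_keypoints: int = 17, feat_dim: int = 64):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(num_keypoints * 2, 128), nn.ReLU(),
                                 nn.Linear(128, feat_dim))

    def forward(self, skeleton):
        return self.mlp(skeleton)


class ImageAppearanceNet(nn.Module):
    """Third DNN: image part -> second appearance data."""
    def __init__(self, feat_dim: int = 64):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(16, feat_dim),
        )

    def forward(self, image_part):
        return self.backbone(image_part)


class FusionNet(nn.Module):
    """Fourth DNN: first + second appearance data -> probabilities per appearance."""
    def __init__(self, feat_dim: int = 64, num_appearances: int = 10):
        super().__init__()
        self.classifier = nn.Sequential(nn.Linear(feat_dim * 2, 128), nn.ReLU(),
                                        nn.Linear(128, num_appearances))

    def forward(self, first_appearance, second_appearance):
        fused = torch.cat([first_appearance, second_appearance], dim=-1)
        return torch.sigmoid(self.classifier(fused))   # probability per appearance


class SecondModel(nn.Module):
    """Wires the four networks together following the data flow in claim 1."""
    def __init__(self):
        super().__init__()
        self.skeleton_net = SkeletonNet()
        self.skeleton_appearance_net = SkeletonAppearanceNet()
        self.image_appearance_net = ImageAppearanceNet()
        self.fusion_net = FusionNet()

    def forward(self, image_part):
        skeleton = self.skeleton_net(image_part)                    # first DNN
        first_appearance = self.skeleton_appearance_net(skeleton)   # second DNN
        second_appearance = self.image_appearance_net(image_part)   # third DNN
        return self.fusion_net(first_appearance, second_appearance) # fourth DNN
```

Calling SecondModel()(torch.randn(1, 3, 128, 64)) returns a tensor of shape (1, 10) holding per-appearance probabilities, mirroring the claim's "prediction of probabilities of said appearance".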