US 12,002,590 B2
Machine learning-based diagnostic classifier
Monika Sharma Mellem, Falls Church, VA (US); Yuelu Liu, South San Francisco, CA (US); Parvez Ahammad, San Jose, CA (US); Humberto Andres Gonzalez Cabezas, Santa Clara, CA (US); William J. Martin, San Francisco, CA (US); and Pablo Christian Gersberg, San Francisco, CA (US)
Assigned to NEUMORA THERAPEUTICS, INC., Brisbane, CA (US)
Filed by NEUMORA THERAPEUTICS, INC., Brisbane, CA (US)
Filed on May 2, 2023, as Appl. No. 18/311,087.
Application 18/311,087 is a continuation of application No. 17/446,633, filed on Sep. 1, 2021, granted, now 11,676,732.
Application 17/446,633 is a continuation of application No. 16/514,879, filed on Jul. 17, 2019, granted, now 11,139,083.
Application 16/514,879 is a continuation of application No. 16/400,312, filed on May 1, 2019, granted, now 11,715,564.
Claims priority of provisional application 62/665,243, filed on May 1, 2018.
Prior Publication US 2023/0343463 A1, Oct. 26, 2023
Int. Cl. G16H 50/30 (2018.01); G16H 10/60 (2018.01); G16H 50/20 (2018.01)
CPC G16H 50/30 (2018.01) [G16H 10/60 (2018.01); G16H 50/20 (2018.01)] 20 Claims
OG exemplary drawing
 
1. A system for evaluating a user, the system comprising:
a microphone;
a camera positioned to capture an image of the user and configured to output video data;
a memory containing a machine-readable medium comprising machine-executable code having stored thereon instructions for performing a method of evaluating the user; and
a control system comprising one or more processors and coupled to the memory, the control system configured to execute the machine-executable code to cause the control system to:
record, by the camera, a set of test video data during a time window;
record, by the microphone, a set of test audio data during the time window;
assign a plurality of pixels to a face of the user in the video data;
determine, based on the plurality of pixels, whether the face of the user is within a frame captured by the camera;
in response to determining that the face of the user is within the frame captured by the camera, output video features associated with the user by processing the plurality of pixels;
identify sounds representing a voice of the user and output audio features associated with the user by processing the audio data;
process, using a neural network, the audio and video features, wherein the neural network was previously trained with training data in an unsupervised manner, the training data comprising audio and video data recorded from a plurality of individuals; and
output an indication of whether the user has at least one of a plurality of characteristics based on the processed audio and video features.
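The claimed processing flow (assign pixels to the face, gate on whether the face is in frame, extract audio/video features, and classify with a previously trained model) can be sketched as follows. All function names, thresholds, data shapes, and the single linear unit standing in for the trained neural network are hypothetical illustrations, not taken from the patent:

```python
# Illustrative sketch of the claimed evaluation pipeline.
# Frames are modeled as lists of (x, y, face_score) tuples; audio as a
# list of amplitude samples. Everything here is a toy stand-in.

def assign_face_pixels(frame, threshold=0.5):
    """Assign pixels to the user's face: keep pixels whose (hypothetical)
    face-likelihood score exceeds a threshold."""
    return [(x, y) for (x, y, score) in frame if score > threshold]

def face_in_frame(face_pixels, min_pixels=3):
    """Determine, from the assigned pixels, whether the face is in frame."""
    return len(face_pixels) >= min_pixels

def extract_video_features(face_pixels):
    """Toy video features: centroid of the face pixels."""
    n = len(face_pixels)
    cx = sum(x for x, _ in face_pixels) / n
    cy = sum(y for _, y in face_pixels) / n
    return [cx, cy]

def extract_audio_features(samples):
    """Toy audio feature: mean absolute amplitude of voiced samples,
    using a crude amplitude gate to identify the user's voice."""
    voiced = [abs(s) for s in samples if abs(s) > 0.1]
    return [sum(voiced) / len(voiced)] if voiced else [0.0]

def classify(features, weights):
    """Stand-in for the trained network: one linear unit plus a threshold.
    The real system would use a network pretrained without supervision."""
    score = sum(w * f for w, f in zip(weights, features))
    return score > 1.0

# Example run on synthetic data recorded during one time window.
frame = [(1, 2, 0.9), (2, 2, 0.8), (3, 1, 0.7), (9, 9, 0.1)]
audio = [0.0, 0.3, -0.4, 0.05, 0.5]
pixels = assign_face_pixels(frame)
if face_in_frame(pixels):  # only extract video features when the face is present
    feats = extract_video_features(pixels) + extract_audio_features(audio)
    print(classify(feats, weights=[0.1, 0.1, 1.0]))
```

The in-frame gate mirrors the claim's conditional step: video features are produced only in response to determining that the face is within the captured frame.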