CPC G06F 3/015 (2013.01) [A61B 3/113 (2013.01)] | 24 Claims |
1. A system for multimodal machine-aided comprehension analysis, the system comprising:
one or more processors and associated memory, the memory being a non-transitory computer-readable medium having executable instructions encoded thereon, such that upon execution of the instructions, the one or more processors perform operations of:
generating an initial scene graph of a scene proximate a user based on an image of the scene, the initial scene graph having one or more subjects and objects, with subject labels, item labels, and relationship labels;
tracking eye movements of the user as the user gazes upon the subject labels, item labels, and relationship labels;
generating a resulting scene graph based on the eye movements of the user and an amount of time the user spends gazing upon each of the subject labels, item labels and relationship labels, the resulting scene graph connecting the subject labels, item labels and relationship labels as relationship triplets;
generating a comprehension model by estimating a user's comprehension of the relationship triplets in the image based on the user's gaze data;
generating a knowledge model based on a known knowledge graph and the comprehension model, the knowledge model specifying the user's background knowledge level and comprehension level; and
generating a cue and presenting the cue to the user.
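The operations recited above can be illustrated with a minimal Python sketch. It is not the claimed implementation. Every name in it (`Triplet`, `dwell_times`, `resulting_scene_graph`, `comprehension_model`, `knowledge_model`, `generate_cue`) is hypothetical, as are the `min_dwell`, `saturation`, and `threshold` parameters. The claim does not state how gaze time maps to comprehension, so the sketch assumes, purely for illustration, that longer dwell on a triplet's labels indicates higher comprehension. Scene-graph extraction from the image and the eye tracker itself are out of scope; the sketch starts from labels and fixation durations that are assumed to be already available.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Triplet:
    """One subject/relationship/item edge of a scene graph (hypothetical type)."""
    subject: str
    relationship: str
    item: str

    def labels(self):
        return (self.subject, self.relationship, self.item)


def dwell_times(fixations):
    """Sum gaze time in seconds per label from (label, seconds) fixations."""
    totals = defaultdict(float)
    for label, seconds in fixations:
        totals[label] += seconds
    return dict(totals)


def resulting_scene_graph(initial, dwell, min_dwell=0.2):
    """Keep triplets whose three labels were each gazed at for min_dwell or more.

    The min_dwell cutoff is an assumption, not taken from the claim.
    """
    return [t for t in initial
            if all(dwell.get(label, 0.0) >= min_dwell for label in t.labels())]


def comprehension_model(triplets, dwell, saturation=1.0):
    """Score each triplet in [0, 1] from the mean dwell time on its labels.

    Assumed mapping: longer dwell means higher comprehension, capped at 1.
    """
    scores = {}
    for t in triplets:
        mean_dwell = sum(dwell.get(label, 0.0) for label in t.labels()) / 3
        scores[t] = min(mean_dwell / saturation, 1.0)
    return scores


def knowledge_model(scores, known_graph, threshold=0.5):
    """Combine a known knowledge graph with the comprehension scores.

    background_knowledge: share of known-graph triplets scored >= threshold.
    comprehension: mean score over the scored triplets.
    """
    known = [t for t in known_graph if scores.get(t, 0.0) >= threshold]
    background = len(known) / len(known_graph) if known_graph else 0.0
    comprehension = sum(scores.values()) / len(scores) if scores else 0.0
    return {"background_knowledge": background, "comprehension": comprehension}


def generate_cue(initial, scores):
    """Point the user at the triplet with the lowest comprehension score."""
    weakest = min(initial, key=lambda t: scores.get(t, 0.0))
    return f"Look again: {weakest.subject} {weakest.relationship} {weakest.item}"
```

In use, the functions would be chained in the order of the claimed operations: fixations are reduced with `dwell_times`, the initial triplets are filtered by `resulting_scene_graph`, the kept triplets are scored by `comprehension_model`, the scores are combined with a known graph by `knowledge_model`, and `generate_cue` produces the cue for the weakest triplet. Triplets absent from the resulting graph are treated as scoring zero, which is a design choice of this sketch only.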