CPC G09B 23/28 (2013.01) [G06N 3/08 (2013.01); G06T 7/0012 (2013.01); G06V 10/454 (2022.01); G06V 10/751 (2022.01); G06V 10/764 (2022.01); G06V 10/82 (2022.01); G06V 20/20 (2022.01); G06V 20/41 (2022.01); G06T 2207/20081 (2013.01); G06T 2207/20084 (2013.01); G06T 2207/30004 (2013.01)]

13 Claims

1. A method comprising:
generating, with a surgical simulator, simulated surgical videos each representative of a simulation of a surgical scenario;
associating simulated ground truth data from the simulation with the simulated surgical videos, wherein the simulated ground truth data corresponds to context information of at least one of a simulated surgical instrument, a simulated anatomical region, or a simulated surgical task simulated by the surgical simulator;
annotating features of the simulated surgical videos based, at least in part, on the simulated ground truth data to generate annotated training data for training a machine learning model;
pre-training the machine learning model with the annotated training data to probabilistically identify the features;
providing, to a refiner neural network, simulated images from the simulated surgical videos before the pre-training;
refining the simulated surgical videos with the refiner neural network, wherein the refiner neural network adjusts the simulated images until a discriminator neural network determines the simulated images are comparable to unlabeled real images within a first threshold, wherein the features annotated from the simulated surgical videos are included after the refining, and wherein the unlabeled real images are representative of the surgical scenario in a real environment;
receiving unlabeled real videos from a surgical video database, each of the unlabeled real videos representative of the surgical scenario;
identifying the features from the unlabeled real videos with the machine learning model trained to probabilistically identify the features from the unlabeled real videos, wherein the features include at least one of a separation distance between a surgical instrument and an anatomical region, temporal boundaries of a surgical complication, or spatial boundaries of the surgical complication; and
annotating the unlabeled real videos with the machine learning model by labeling the features of the unlabeled real videos identified by the machine learning model.
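The refining step recited above (adjusting simulated images until a discriminator deems them comparable to real images within a first threshold, while the annotations survive the refinement) can be illustrated with a toy sketch. All names here are invented for illustration: real refiner/discriminator networks would be trained adversarially, whereas this sketch stands in a simple statistic (mean-intensity gap) for the discriminator and a gradient-like nudge for the refiner.

```python
import random

def mean(xs):
    return sum(xs) / len(xs)

def discriminator_gap(sim_batch, real_batch):
    # Toy discriminator statistic: gap between the batches' mean intensities.
    return abs(mean([mean(img) for img in sim_batch]) -
               mean([mean(img) for img in real_batch]))

def refine(sim_batch, real_batch, threshold=0.01, step=0.1, max_iters=1000):
    # Nudge simulated pixels toward the real distribution until the
    # discriminator gap falls within the threshold ("comparable").
    sim = [list(img) for img in sim_batch]
    target = mean([mean(img) for img in real_batch])
    for _ in range(max_iters):
        if discriminator_gap(sim, real_batch) <= threshold:
            break
        current = mean([mean(img) for img in sim])
        delta = step * (target - current)
        sim = [[p + delta for p in img] for img in sim]
    return sim

random.seed(0)
sim_images = [[random.uniform(0.0, 0.5) for _ in range(16)] for _ in range(4)]
real_images = [[random.uniform(0.4, 0.9) for _ in range(16)] for _ in range(4)]
annotations = ["instrument", "anatomy", "instrument", "task"]  # untouched by refining

refined = refine(sim_images, real_images)
print(discriminator_gap(refined, real_images) <= 0.01)  # True after refinement
```

Note that `annotations` is deliberately left unmodified: the refiner changes only pixel statistics, so labels derived from the simulator's ground truth remain valid for the refined images, which is the point of refining before (or alongside) pre-training.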
|
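The pre-training and annotation steps (train on simulator-derived annotations, then probabilistically identify and label features in unlabeled real videos) amount to a pseudo-labeling loop. The sketch below is a deliberately minimal stand-in, with all names invented: a real system would use a deep network over video frames, whereas here the "model" is just a per-label mean of a scalar feature (e.g., a separation distance between instrument and anatomy) and identification is a softmax over negative distances to those means.

```python
import math

def pretrain(annotated):
    # annotated: list of (feature_value, label) pairs from simulated ground
    # truth; the "model" is the per-label mean feature value.
    sums, counts = {}, {}
    for x, y in annotated:
        sums[y] = sums.get(y, 0.0) + x
        counts[y] = counts.get(y, 0) + 1
    return {y: sums[y] / counts[y] for y in sums}

def identify(model, x):
    # Probabilistic identification: softmax over negative distances to
    # each label's learned mean.
    scores = {y: -abs(x - m) for y, m in model.items()}
    z = sum(math.exp(s) for s in scores.values())
    return {y: math.exp(s) / z for y, s in scores.items()}

def annotate(model, real_features):
    # Label each real feature with its most probable class (pseudo-label).
    return [max(identify(model, x).items(), key=lambda kv: kv[1])[0]
            for x in real_features]

# Simulated ground truth: small separation distances labeled "contact",
# large ones "clear".
sim_annotated = [(0.1, "contact"), (0.2, "contact"), (1.0, "clear"), (1.2, "clear")]
model = pretrain(sim_annotated)
print(annotate(model, [0.15, 1.1]))  # ['contact', 'clear']
```

The same pattern generalizes to the claim's other feature types (temporal or spatial boundaries of a complication): the model trained on simulator annotations assigns each candidate feature in the real video a probability, and the most probable labels become the annotations on the previously unlabeled real videos.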