US 12,443,824 B2
Systems and methods for providing feedback for artificial intelligence-based image capture devices
Aaron Michael Donsbach, Seattle, WA (US); Christopher Breithaupt, Berkeley, CA (US); Li Zhang, Seattle, WA (US); Arushan Rajasekaram, Seattle, WA (US); and Navid Shiee, Seattle, WA (US)
Assigned to GOOGLE LLC, Mountain View, CA (US)
Filed by Google LLC, Mountain View, CA (US)
Filed on Apr. 19, 2024, as Appl. No. 18/641,054.
Application 18/641,054 is a continuation of application No. 17/878,724, filed on Aug. 1, 2022, granted, now 11,995,530.
Application 17/878,724 is a continuation of application No. 17/266,957, granted, now 11,403,509, issued on Aug. 2, 2022, previously published as PCT/US2019/014481, filed on Jan. 22, 2019.
Claims priority of provisional application 62/742,810, filed on Oct. 8, 2018.
Prior Publication US 2024/0273340 A1, Aug. 15, 2024
This patent is subject to a terminal disclaimer.
Int. Cl. G06N 3/044 (2023.01); G06N 3/045 (2023.01); G06N 3/082 (2023.01); G06N 3/084 (2023.01); H04N 23/60 (2023.01); H04N 23/611 (2023.01); H04N 23/62 (2023.01); H04N 23/63 (2023.01)
CPC G06N 3/044 (2023.01) [G06N 3/045 (2023.01); G06N 3/084 (2013.01); H04N 23/611 (2023.01); H04N 23/62 (2023.01); H04N 23/632 (2023.01); H04N 23/64 (2023.01)]
20 Claims
 
1. A computing system, comprising:
an image capture system configured to capture a plurality of image frames;
an artificial intelligence system comprising one or more machine-learned models, the artificial intelligence system configured to analyze each of the plurality of image frames and to output, for each of the plurality of image frames, a respective measure of one or more attributes of a respective scene depicted by the image frame, wherein the one or more machine-learned models comprise at least one of a machine-learned pose detection model and a machine-learned facial expression model;
a display;
one or more processors; and
one or more non-transitory computer-readable media that store instructions that, when executed by the one or more processors, cause the one or more processors to perform operations, the operations comprising:
providing, in a viewfinder portion of a user interface presented on the display, a live video stream that depicts at least a portion of a current field of view of the image capture system, wherein the live video stream comprises the plurality of image frames;
providing, in the viewfinder portion of the user interface presented on the display, a graphical intelligence feedback indicator in association with the live video stream, the graphical intelligence feedback indicator graphically indicating, for each of the plurality of image frames as such image frame is presented within the viewfinder portion of the user interface, the respective measure of the one or more attributes of the respective scene depicted by the image frame output by the artificial intelligence system, wherein the respective measure is determined based in part on detection of one or more poses by the machine-learned pose detection model or on detection of one or more facial expressions by the machine-learned facial expression model; and
in response to the respective measure being greater than or equal to a threshold:
automatically storing a non-temporary copy of the image frame in a memory of the computing system.
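
The operations recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation: the `Frame` type, the precomputed `pose_score` and `expression_score` fields (standing in for the outputs of the machine-learned models), the use of the maximum to combine them, and the text bar standing in for the graphical intelligence feedback indicator are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass
class Frame:
    """Hypothetical stand-in for one captured image frame."""
    index: int
    pose_score: float        # assumed pose detection model output, in [0, 1]
    expression_score: float  # assumed facial expression model output, in [0, 1]


def scene_measure(frame: Frame) -> float:
    """Combine the model outputs into a single measure.

    Taking the maximum is an assumption; the claim only requires that the
    measure depend in part on detected poses or facial expressions.
    """
    return max(frame.pose_score, frame.expression_score)


def indicator_bar(measure: float, width: int = 10) -> str:
    """Render the measure as a text bar (a stand-in for the graphical indicator)."""
    clamped = min(max(measure, 0.0), 1.0)
    filled = int(clamped * width + 0.5)  # round half up
    return "#" * filled + "-" * (width - filled)


def run_viewfinder(
    frames: Iterable[Frame], threshold: float
) -> Tuple[List[Tuple[int, str]], List[Frame]]:
    """Process a live stream of frames.

    For each frame, compute its measure, record the indicator that would be
    shown alongside it in the viewfinder, and keep a copy of the frame when
    the measure is greater than or equal to the threshold.
    """
    shown: List[Tuple[int, str]] = []
    saved: List[Frame] = []
    for frame in frames:
        measure = scene_measure(frame)
        shown.append((frame.index, indicator_bar(measure)))
        if measure >= threshold:
            saved.append(frame)  # stands in for the non-temporary copy in memory
    return shown, saved


if __name__ == "__main__":
    stream = [Frame(0, 0.2, 0.1), Frame(1, 0.5, 0.9), Frame(2, 0.7, 0.3)]
    shown, saved = run_viewfinder(stream, threshold=0.7)
    for index, bar in shown:
        print(index, bar)
    print("stored frames:", [f.index for f in saved])
```

In this sketch every frame gets an indicator, while only the frames whose measure meets the threshold are stored, mirroring the claim's separation between continuous feedback and automatic capture.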