US 11,928,880 B1
Frame-based body part detection in video clips
Xiaohang Sun, Seattle, WA (US); Mohamed Kamal Omar, Seattle, WA (US); Alexander Ratnikov, Redmond, WA (US); Ahmed Aly Saad Ahmed, Bothell, WA (US); Tai-Ching Li, Issaquah, WA (US); Travis Silvers, Lynnwood, WA (US); Hanxiao Deng, Bellevue, WA (US); Muhammad Raffay Hamid, Seattle, WA (US); and Ivan Ryndin, Shoreline, WA (US)
Assigned to Amazon Technologies, Inc., Seattle, WA (US)
Filed by Amazon Technologies, Inc., Seattle, WA (US)
Filed on Mar. 29, 2021, as Appl. No. 17/215,816.
Int. Cl. G06V 40/10 (2022.01); G06F 18/21 (2023.01); G06N 3/08 (2023.01); G06V 20/40 (2022.01)
CPC G06V 40/10 (2022.01) [G06F 18/2178 (2023.01); G06N 3/08 (2013.01); G06V 20/46 (2022.01)]
19 Claims
 
15. One or more non-transitory computer-readable storage media comprising computer-executable instructions that, when executed by one or more processors, cause the one or more processors to perform operations comprising:
receiving video content including a first frame, the first frame showing an uncovered portion of a body of a person, the uncovered portion being one of a plurality of uncovered body part types, the uncovered portion comprising an external, uncovered portion of the body of the person;
determining a plurality of scores, respectively, for frames of the video content;
selecting the first frame based at least in part on determining that the first frame is associated with a maximum score of the plurality of scores of the frames of the video content;
receiving, by a machine learning model, the first frame of the video content, the machine learning model trained based at least in part on a loss function that penalizes the machine learning model for incorrectly identifying that a region of a map associated with a frame is associated with one of the plurality of uncovered body part types;
determining, by the machine learning model, a score indicating a likelihood that the first frame shows at least one of the plurality of uncovered body part types, wherein the score is the maximum score; and
determining, based at least in part on the maximum score, whether a portion of the video content that includes the first frame will be presented on a display based at least in part on the score.
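The operations recited in claim 15 can be illustrated with a minimal sketch. It is not the patented implementation: the per-region probability map, the rule that a frame's score is its highest region probability, the 0.5 presentation threshold, and the false-positive weight in the loss are all assumptions chosen for illustration, and every function name below is hypothetical.

```python
import math
from typing import List, Tuple

# Assumption: the model emits, per frame, a 2-D map of probabilities that
# each region shows one of the uncovered body part types.
RegionMap = List[List[float]]


def frame_score(region_map: RegionMap) -> float:
    """Assumed scoring rule: a frame's score is its highest region probability."""
    return max(p for row in region_map for p in row)


def select_max_frame(scores: List[float]) -> Tuple[int, float]:
    """Return the index and value of the maximum score among the frames."""
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores[best]


def should_present(max_score: float, threshold: float = 0.5) -> bool:
    """Assumed policy: present the portion only if the maximum score is below a threshold."""
    return max_score < threshold


def region_loss(pred: RegionMap, target: List[List[int]],
                fp_weight: float = 2.0, eps: float = 1e-7) -> float:
    """Hypothetical training loss: binary cross-entropy over map regions,
    with extra weight on regions wrongly flagged as a body part type."""
    total, count = 0.0, 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
            if t == 1:
                total += -math.log(p)
            else:
                # Probability assigned to a negative region is penalized more.
                total += -fp_weight * math.log(1.0 - p)
            count += 1
    return total / count


# Three frames of a clip, each with a made-up 2x2 region map.
maps = [
    [[0.10, 0.20], [0.05, 0.10]],
    [[0.30, 0.90], [0.20, 0.10]],
    [[0.40, 0.10], [0.20, 0.30]],
]
scores = [frame_score(m) for m in maps]
first_frame_index, max_score = select_max_frame(scores)
present = should_present(max_score)
```

In this sketch a single high-scoring frame is enough to withhold the portion of the clip that contains it, which mirrors the claim's use of the maximum score rather than an average over frames.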