US 12,444,193 B2
Target object recognition
Baohan Xu, Shanghai (CN); and Peiyi Li, Shanghai (CN)
Assigned to Shanghai Hode Information Technology Co., Ltd., Shanghai (CN)
Filed by Shanghai Hode Information Technology Co., Ltd., Shanghai (CN)
Filed on Apr. 7, 2023, as Appl. No. 18/131,993.
Application 18/131,993 is a continuation of application No. PCT/CN2021/120387, filed on Sep. 24, 2021.
Claims priority of application No. 202011529196.5 (CN), filed on Dec. 22, 2020.
Prior Publication US 2023/0281990 A1, Sep. 7, 2023
Int. Cl. G06V 20/40 (2022.01); G06T 7/174 (2017.01); G06V 10/40 (2022.01)

CPC G06V 20/46 (2022.01) [G06T 7/174 (2017.01); G06V 10/40 (2022.01); G06T 2207/30242 (2013.01)]

15 Claims

1. A method, comprising:

receiving a to-be-processed video;

extracting i video frames from the to-be-processed video as initial pictures based on a preset extraction rule, wherein i∈[1, n], and i is a positive integer;

inputting a received initial picture into a first detection model to obtain an initial location of each of one or more target objects in the initial picture, comprising:

inputting a received i-th initial picture into the first detection model to obtain the initial location of each of the one or more target objects in the i-th initial picture;

inputting a candidate picture corresponding to the initial location into a second detection model to obtain a verification object in the candidate picture and a verification location of the verification object in the candidate picture;

adjusting the initial location of each of the one or more target objects based on the verification location to obtain a target location of each of the one or more target objects; and

inputting a target picture corresponding to the target location into a recognition model to obtain the one or more target objects in the initial picture, comprising:

inputting the target picture corresponding to the target location into the recognition model to obtain the one or more target objects in the i-th initial picture;

determining whether i is greater than n;

in response to determining that i is greater than n, counting the one or more target objects in each initial picture; and

in response to determining that i is not greater than n, increasing i by 1, and continuing to input the received i-th initial picture into the first detection model.
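
For illustration only, the flow recited in claim 1 might be sketched as follows. This is a minimal sketch under stated assumptions, not the claimed implementation. All function and parameter names (`extract_frames`, `crop`, `to_frame_coords`, `recognize_targets`, `step`) are hypothetical. The three models are passed in as plain callables. The "preset extraction rule" is assumed to be a fixed frame stride. Locations are assumed to be `(x1, y1, x2, y2)` boxes. The adjustment step is assumed to replace the initial location with the verification location mapped back into frame coordinates. Detections that the second model does not confirm are assumed to be dropped. The claim itself fixes none of these choices.

```python
from collections import Counter
from typing import Callable, List, Optional, Sequence, Tuple

Box = Tuple[int, int, int, int]   # (x1, y1, x2, y2); box format is an assumption
Picture = List[List[int]]         # toy stand-in for an image: rows of pixel values


def extract_frames(video: Sequence[Picture], step: int) -> List[Picture]:
    """Assumed extraction rule: keep every `step`-th frame as an initial picture."""
    return list(video[::step])


def crop(picture: Picture, box: Box) -> Picture:
    """Cut the region described by `box` out of `picture`."""
    x1, y1, x2, y2 = box
    return [row[x1:x2] for row in picture[y1:y2]]


def to_frame_coords(initial: Box, verification: Box) -> Box:
    """Assumed adjustment: shift a crop-relative verification box by the
    top-left corner of the initial box to get the target location."""
    ox, oy = initial[0], initial[1]
    vx1, vy1, vx2, vy2 = verification
    return (ox + vx1, oy + vy1, ox + vx2, oy + vy2)


def recognize_targets(
    video: Sequence[Picture],
    first_detector: Callable[[Picture], List[Box]],
    second_detector: Callable[[Picture], Optional[Tuple[str, Box]]],
    recognizer: Callable[[Picture], str],
    step: int = 1,
) -> List[Counter]:
    """Return one Counter of recognized target objects per initial picture."""
    counts_per_picture: List[Counter] = []
    # Iterating over the extracted pictures plays the role of the
    # "increase i by 1 and continue" loop in the claim.
    for picture in extract_frames(video, step):
        labels: List[str] = []
        for initial in first_detector(picture):        # initial locations
            candidate = crop(picture, initial)         # candidate picture
            verified = second_detector(candidate)      # verification object and location
            if verified is None:
                continue  # assumption: unverified detections are discarded
            _verification_object, verification = verified
            target = to_frame_coords(initial, verification)   # target location
            labels.append(recognizer(crop(picture, target)))  # target picture -> object
        counts_per_picture.append(Counter(labels))     # counting step
    return counts_per_picture
```

With real models, each callable would wrap an inference call. The per-picture `Counter` objects can be summed to obtain totals across the whole video.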