US 12,299,399 B2
Method, apparatus, and system for recognizing text in image
Lin Du, Beijing (CN); Alfred Chixiong Shen, Shenzhen (CN); and Lemeng Pan, Shenzhen (CN)
Assigned to HUAWEI TECHNOLOGIES CO., LTD., Shenzhen (CN)
Filed by Huawei Technologies Co., Ltd., Guangdong (CN)
Filed on May 10, 2022, as Appl. No. 17/740,718.
Application 17/740,718 is a continuation of application No. PCT/CN2020/133807, filed on Dec. 4, 2020.
Claims priority of application No. 201911391341.5 (CN), filed on Dec. 30, 2019.
Prior Publication US 2022/0262151 A1, Aug. 18, 2022
Int. Cl. G06F 40/30 (2020.01); G06F 40/295 (2020.01); G06V 10/774 (2022.01); G06V 10/778 (2022.01); G06V 10/80 (2022.01); G06V 10/82 (2022.01); G06V 30/14 (2022.01); G06V 30/18 (2022.01); G06V 30/262 (2022.01); G06V 30/412 (2022.01); G06V 30/416 (2022.01)

CPC G06F 40/30 (2020.01) [G06F 40/295 (2020.01); G06V 10/774 (2022.01); G06V 10/778 (2022.01); G06V 10/806 (2022.01); G06V 10/82 (2022.01); G06V 30/1444 (2022.01); G06V 30/18057 (2022.01); G06V 30/274 (2022.01); G06V 30/412 (2022.01); G06V 30/416 (2022.01)]

20 Claims

1. A method for recognizing a to-be-recognized text in an image, comprising:

obtaining a plurality of recognition results of the to-be-recognized text in the image according to a plurality of recognition methods;

training a first machine learning model based on a plurality of first training samples, labels of the plurality of first training samples, a plurality of second training samples obtained after partial information of each of the plurality of first training samples is masked, and masked information;

obtaining semantic information of the plurality of recognition results based on the first machine learning model;

obtaining feature information of the image, wherein the feature information of the image represents information around the to-be-recognized text in the image; and

determining a target recognition result of the to-be-recognized text from the plurality of recognition results based on the feature information of the image and the semantic information of the plurality of recognition results.
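
The steps of claim 1 can be illustrated with a minimal sketch. It is not the patented implementation: the claim does not say how samples are masked, how semantic information is encoded, or how candidates are scored. The names `make_masked_sample`, `toy_semantic_info`, `select_target_result`, the `[MASK]` token, the two-element feature vectors, and the dot-product score are all hypothetical stand-ins chosen for illustration. In particular, `toy_semantic_info` is a hand-written placeholder for the trained first machine learning model.

```python
import random
from typing import Callable, Dict, List, Sequence, Tuple

MASK = "[MASK]"  # hypothetical mask token; the claim does not name one


def make_masked_sample(
    tokens: Sequence[str], mask_ratio: float, rng: random.Random
) -> Tuple[List[str], Dict[int, str]]:
    """Mask part of a first training sample.

    Returns the second training sample (with masked positions) and the
    masked information (position -> original token).
    """
    if not tokens:
        raise ValueError("cannot mask an empty sample")
    n_mask = min(len(tokens), max(1, round(len(tokens) * mask_ratio)))
    positions = sorted(rng.sample(range(len(tokens)), n_mask))
    masked = list(tokens)
    masked_info: Dict[int, str] = {}
    for pos in positions:
        masked_info[pos] = masked[pos]
        masked[pos] = MASK
    return masked, masked_info


def toy_semantic_info(text: str) -> List[float]:
    """Placeholder for the first machine learning model (assumption).

    Encodes a candidate as [fraction of digits, fraction of letters].
    """
    if not text:
        return [0.0, 0.0]
    digits = sum(ch.isdigit() for ch in text)
    letters = sum(ch.isalpha() for ch in text)
    return [digits / len(text), letters / len(text)]


def select_target_result(
    candidates: Sequence[str],
    semantic_fn: Callable[[str], List[float]],
    image_features: Sequence[float],
) -> str:
    """Pick the candidate whose semantic information best matches the
    feature information of the image (here: a plain dot product)."""
    if not candidates:
        raise ValueError("no recognition results to choose from")

    def score(candidate: str) -> float:
        semantic = semantic_fn(candidate)
        return sum(s * f for s, f in zip(semantic, image_features))

    return max(candidates, key=score)
```

As a usage example under the same assumptions, the candidates `"2019-12-30"`, `"2O19-I2-3O"`, and `"Z019-12-30"` could come from three recognition methods, and image features of `[1.0, 0.0]` could encode that the surrounding text (for instance a nearby "Date" label) suggests a numeric field. The sketch then selects the candidate with the highest digit fraction.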