US 12,333,853 B2
Face parsing method and related devices
Yinglu Liu, Beijing (CN); Hailin Shi, Beijing (CN); and Tao Mei, Beijing (CN)
Assigned to BEIJING JINGDONG SHANGKE INFORMATION TECHNOLOGY CO, LTD., Beijing (CN); and BEIJING JINGDONG CENTURY TRADING CO., LTD., Beijing (CN)
Appl. No. 17/777,045
Filed by BEIJING JINGDONG SHANGKE INFORMATION TECHNOLOGY CO, LTD., Beijing (CN); and BEIJING JINGDONG CENTURY TRADING CO., LTD., Beijing (CN)
PCT Filed Aug. 18, 2020, PCT No. PCT/CN2020/109826 § 371(c)(1), (2) Date May 15, 2022, PCT Pub. No. WO2021/098300, PCT Pub. Date May 27, 2021.
Claims priority of application No. 201911125557.7 (CN), filed on Nov. 18, 2019.
Prior Publication US 2022/0406090 A1, Dec. 22, 2022
Int. Cl. G06V 40/16 (2022.01); G06V 10/82 (2022.01)

CPC G06V 40/162 (2022.01) [G06V 10/82 (2022.01); G06V 40/168 (2022.01)]

17 Claims

1. A face parsing method, comprising:

inputting a facial image into a pre-trained face parsing neural network;

extracting a semantic feature from the facial image using a semantic perception sub-network of the face parsing neural network, wherein the semantic feature represents probabilities that each pixel in the facial image belongs to various facial regions;

extracting a boundary feature from the facial image using a boundary perception sub-network of the face parsing neural network, wherein the boundary feature represents probabilities that each pixel in the facial image belongs to boundaries between different facial regions; and

processing the concatenated semantic feature and boundary feature using a fusion sub-network of the face parsing neural network to obtain a facial region to which each pixel in the facial image belongs,

wherein a loss function used in training the face parsing neural network includes a loss function of the semantic perception sub-network and a loss function of the boundary perception sub-network, wherein:

the loss function of the semantic perception sub-network is determined according to prediction probabilities that each pixel of the semantic feature belongs to various facial regions, and a facial region each pixel of the semantic feature actually belongs to; and

the loss function of the boundary perception sub-network is determined according to prediction probabilities that each pixel of the boundary feature belongs to boundaries between different facial regions, and whether each pixel of the boundary feature actually belongs to the boundaries.
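
Illustrative note (not part of the claim): the training loss recited above can be sketched as follows. The claim only states that the overall loss includes a semantic term and a boundary term; it does not name the loss formulas, how the terms are weighted, or how ground-truth boundaries are obtained. This sketch therefore assumes multi-class cross-entropy for the semantic term, binary cross-entropy for the boundary term, a weighted sum with hypothetical weights `w_sem` and `w_bnd`, and a hypothetical `boundary_mask` helper that marks a pixel as a boundary when a 4-connected neighbour carries a different region label. All function names are illustrative.

```python
import math

EPS = 1e-12  # guards against log(0); an implementation detail of this sketch


def boundary_mask(label_map):
    """Assumed ground-truth rule: a pixel is a boundary pixel (1) if any
    4-connected neighbour has a different facial-region label, else 0."""
    h, w = len(label_map), len(label_map[0])
    mask = [[0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < h and 0 <= cc < w and label_map[rr][cc] != label_map[r][c]:
                    mask[r][c] = 1
                    break
    return mask


def semantic_loss(region_probs, true_regions):
    """Mean cross-entropy (assumed form). region_probs[i] holds the predicted
    probability of each facial region for pixel i; true_regions[i] is the
    index of the region that pixel actually belongs to."""
    total = sum(-math.log(p[y] + EPS) for p, y in zip(region_probs, true_regions))
    return total / len(true_regions)


def boundary_loss(boundary_probs, is_boundary):
    """Mean binary cross-entropy (assumed form). boundary_probs[i] is the
    predicted probability that pixel i lies on a boundary; is_boundary[i]
    is 1 if it actually does, else 0."""
    total = 0.0
    for p, b in zip(boundary_probs, is_boundary):
        total -= b * math.log(p + EPS) + (1 - b) * math.log(1 - p + EPS)
    return total / len(is_boundary)


def total_loss(region_probs, true_regions, boundary_probs, is_boundary,
               w_sem=1.0, w_bnd=1.0):
    """Weighted sum of both terms; the weights are hypothetical."""
    return (w_sem * semantic_loss(region_probs, true_regions)
            + w_bnd * boundary_loss(boundary_probs, is_boundary))


# Usage on a 2x3 label map with two regions (0 and 1).
labels = [[0, 0, 1],
          [0, 0, 1]]
mask = boundary_mask(labels)  # [[0, 1, 1], [0, 1, 1]]

flat_labels = [y for row in labels for y in row]
flat_mask = [b for row in mask for b in row]
uniform_regions = [[0.5, 0.5] for _ in flat_labels]  # uninformative prediction
uniform_boundary = [0.5 for _ in flat_mask]
loss = total_loss(uniform_regions, flat_labels, uniform_boundary, flat_mask)
```

Pixels are flattened into lists here purely for brevity; a practical implementation would operate on tensors and would typically compute both terms from network logits rather than from probabilities.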