US 11,783,462 B2
Method and apparatus for image processing
Jenhao Hsiao, Palo Alto, CA (US); and Chiuman Ho, Palo Alto, CA (US)
Assigned to GUANGDONG OPPO MOBILE TELECOMMUNICATIONS CORP., LTD., Guangdong (CN)
Filed by GUANGDONG OPPO MOBILE TELECOMMUNICATIONS CORP., LTD., Guangdong (CN)
Filed on Jan. 20, 2021, as Appl. No. 17/153,439.
Application 17/153,439 is a continuation of application No. PCT/CN2019/098701, filed on Jul. 31, 2019.
Claims priority of provisional application 62/713,296, filed on Aug. 1, 2018.
Prior Publication US 2021/0142455 A1, May 13, 2021
Int. Cl. G06T 5/00 (2006.01); G06T 5/50 (2006.01); G06T 7/194 (2017.01); G06N 3/02 (2006.01); G06T 11/00 (2006.01)
CPC G06T 5/50 (2013.01) [G06N 3/02 (2013.01); G06T 7/194 (2017.01); G06T 11/00 (2013.01); G06T 2207/20081 (2013.01); G06T 2207/20084 (2013.01)] 9 Claims
OG exemplary drawing
 
1. A method for image processing, comprising:
obtaining a mask by separating an original image into a background image and a foreground image;
obtaining a partial stylized image by transforming the foreground image according to a selected style without transforming the background image; and
obtaining a stylized image according to the mask and the partial stylized image;
wherein obtaining the partial stylized image by transforming the foreground image according to the selected style without transforming the background image comprises transforming the foreground image according to the selected style with an image transformation network to obtain the partial stylized image;
wherein prior to obtaining the mask, the image transformation network is trained with perceptual loss functions defined by a loss network;
wherein the loss network is a visual geometry group (VGG) network having a plurality of convolutional layers;
wherein training the image transformation network with perceptual loss functions defined by the loss network comprises:
using values of the perceptual loss functions as activations of one of the plurality of convolutional layers of the loss network according to the following formula:

ℓ_feat^{φ,j}(ŷ, y) = (1 / (C_j·H_j·W_j)) · ‖φ_j(ŷ) − φ_j(y)‖₂²,
where ŷ denotes an output image of the image transformation network and y denotes a corresponding target image;
wherein:
j represents a jth convolutional layer;
Cj represents a number of channels input into the jth convolutional layer;
Hj represents a height of the jth convolutional layer;
Wj represents a width of the jth convolutional layer; and
φj(x) represents a feature map of shape Cj×Hj×Wj.
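The claimed method can be sketched in a few lines: blend the stylized foreground with the untouched background via the mask, and score training with a feature-reconstruction (perceptual) loss normalized by C_j·H_j·W_j. This is a minimal NumPy illustration, not the patent's implementation: `toy_style` stands in for the trained image transformation network, and `toy_phi` (a fixed random 1×1 projection) stands in for a VGG convolutional layer φ_j; both names are hypothetical.

```python
import numpy as np

def composite(mask, stylized_fg, original):
    """Blend per the claim: mask * stylized foreground + (1 - mask) * background."""
    m = mask[..., None]  # broadcast the (H, W) mask over the color channels
    return m * stylized_fg + (1.0 - m) * original

def perceptual_loss(phi, y_hat, y):
    """Feature reconstruction loss ||phi(y_hat) - phi(y)||^2 / (C_j * H_j * W_j)."""
    f1, f2 = phi(y_hat), phi(y)
    c, h, w = f1.shape
    return np.sum((f1 - f2) ** 2) / (c * h * w)

# --- hypothetical stand-ins, NOT the patent's trained networks ---
def toy_style(img):
    # stand-in for the image transformation network (simple color remap);
    # the mask blend below leaves the background pixels untransformed
    return np.clip(img * np.array([1.2, 0.8, 1.0]), 0.0, 1.0)

rng = np.random.default_rng(0)
W_k = rng.normal(size=(4, 3))  # fixed random projection standing in for VGG weights

def toy_phi(img):
    # produces a C_j x H_j x W_j "feature map" from an H x W x 3 image
    return np.einsum('hwc,kc->khw', img, W_k)

original = rng.random((8, 8, 3))
mask = np.zeros((8, 8))
mask[2:6, 2:6] = 1.0                      # foreground region of the mask
partial = toy_style(original)             # partial stylized image
stylized = composite(mask, partial, original)
loss = perceptual_loss(toy_phi, stylized, original)
```

Outside the masked region `stylized` equals `original` exactly, matching the claim's requirement that the background is not transformed; the loss is zero when the two images agree in feature space.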