US 12,254,544 B2
Image-text fusion method and apparatus, and electronic device
Wenjie Zhang, Nanjing (CN); Weicai Zhong, Xi'an (CN); and Liang Hu, Shenzhen (CN)
Assigned to HUAWEI TECHNOLOGIES CO., LTD., Shenzhen (CN)
Appl. No. 17/634,002
Filed by HUAWEI TECHNOLOGIES CO., LTD., Shenzhen (CN)
PCT Filed Aug. 4, 2020, PCT No. PCT/CN2020/106900
§ 371(c)(1), (2) Date Feb. 9, 2022,
PCT Pub. No. WO2021/036715, PCT Pub. Date Mar. 4, 2021.
Claims priority of application No. 201910783866.7 (CN), filed on Aug. 23, 2019.
Prior Publication US 2022/0319077 A1, Oct. 6, 2022
Int. Cl. G06T 11/60 (2006.01); G06T 11/00 (2006.01); G06V 10/44 (2022.01); G06V 10/46 (2022.01); G06V 10/54 (2022.01); G06V 10/56 (2022.01); G06V 20/62 (2022.01); G06V 40/16 (2022.01)
CPC G06T 11/60 (2013.01) [G06T 11/001 (2013.01); G06V 10/44 (2022.01); G06V 10/462 (2022.01); G06V 10/54 (2022.01); G06V 10/56 (2022.01); G06V 20/62 (2022.01); G06V 40/168 (2022.01)]
20 Claims
 
1. An image-text fusion method, wherein the method comprises:
obtaining a first image and a first text to be laid out in the first image;
determining a feature value of each pixel in the first image, wherein a feature value of a pixel is used to represent a probability that a user pays attention to the pixel, wherein the probability that the user pays attention to the pixel is higher for greater feature values of the pixel;
determining a plurality of first layout formats of the first text in the first image based on the first text and the feature value of each pixel in the first image, wherein when the first text is laid out in the first image based on each first layout format, the first text does not block a pixel whose feature value is greater than a first threshold;
determining a second layout format from the plurality of first layout formats based on cost parameters of the plurality of first layout formats, wherein a cost parameter of a first layout format is used to represent a magnitude of a feature value of a pixel blocked by the first text when the first text is laid out in the first image based on the first layout format, and a balance degree of feature value distribution of pixels in each region in the first image in which the first text is laid out; and
laying out the first text in the first image based on the second layout format to obtain a second image.
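
The steps recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation, and it rests on several assumptions that are not in the claim: the per-pixel feature values are already available as a 2D list of floats in [0, 1]; the first text is reduced to a fixed rectangular box of box_h by box_w pixels; a layout format is just the (top, left) position of that box; the "magnitude" of the blocked feature values is taken as their mean; and the "balance degree" is approximated by the population variance of the feature values under the box. All function names and the balance_weight parameter are hypothetical.

```python
from statistics import pvariance


def box_values(saliency, top, left, box_h, box_w):
    """Feature values of the pixels that a text box at (top, left) would cover."""
    return [saliency[r][c]
            for r in range(top, top + box_h)
            for c in range(left, left + box_w)]


def candidate_layouts(saliency, box_h, box_w, threshold):
    """First layout formats: placements that cover no pixel above the threshold."""
    rows, cols = len(saliency), len(saliency[0])
    return [(top, left)
            for top in range(rows - box_h + 1)
            for left in range(cols - box_w + 1)
            if max(box_values(saliency, top, left, box_h, box_w)) <= threshold]


def layout_cost(saliency, top, left, box_h, box_w, balance_weight=1.0):
    """Assumed cost: mean blocked feature value plus a weighted imbalance term."""
    values = box_values(saliency, top, left, box_h, box_w)
    blocked = sum(values) / len(values)   # how much attention the text hides
    imbalance = pvariance(values)         # stand-in for the "balance degree"
    return blocked + balance_weight * imbalance


def choose_layout(saliency, box_h, box_w, threshold, balance_weight=1.0):
    """Second layout format: the lowest-cost candidate, or None if none fits."""
    candidates = candidate_layouts(saliency, box_h, box_w, threshold)
    if not candidates:
        return None
    return min(candidates,
               key=lambda pos: layout_cost(saliency, pos[0], pos[1],
                                           box_h, box_w, balance_weight))


# Toy 3x4 feature map: the top-left corner holds the salient subject.
saliency = [
    [0.9, 0.8, 0.1, 0.0],
    [0.7, 0.6, 0.2, 0.0],
    [0.1, 0.1, 0.3, 0.1],
]

print(choose_layout(saliency, box_h=1, box_w=2, threshold=0.5))  # (0, 2)
```

With the default weight, the box lands in the top-right corner, where the covered feature values are lowest. Raising balance_weight favors placements over uniformly valued pixels, which in this toy map moves the box to the bottom-left position (2, 0). Laying out the text at the chosen position to produce the second image would be a separate rendering step, not shown here.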