US 11,869,129 B2
Learning apparatus and method for creating image and apparatus and method for image creation
Gyu Sang Choi, Daegu (KR); Jong Ho Han, Gyeongsangbuk-do (KR); and Hyun Kwang Shin, Gyeongsangbuk-do (KR)
Assigned to RESEARCH COOPERATION FOUNDATION OF YEUNGNAM UNIVERSITY, Gyeongsangbuk-Do (KR)
Appl. No. 18/027,326
Filed by RESEARCH COOPERATION FOUNDATION OF YEUNGNAM UNIVERSITY, Gyeongsangbuk-do (KR)
PCT Filed Sep. 29, 2021, PCT No. PCT/KR2021/013316
§ 371(c)(1), (2) Date Mar. 20, 2023,
PCT Pub. No. WO2022/131497, PCT Pub. Date Jun. 23, 2022.
Claims priority of application No. 10-2020-0178374 (KR), filed on Dec. 18, 2020.
Prior Publication US 2023/0274479 A1, Aug. 31, 2023
Int. Cl. G06T 11/60 (2006.01); G06F 40/30 (2020.01); G06T 5/50 (2006.01); G06T 7/49 (2017.01); G06V 10/46 (2022.01); G06V 30/10 (2022.01)
CPC G06T 11/60 (2013.01) [G06F 40/30 (2020.01); G06T 5/50 (2013.01); G06T 7/49 (2017.01); G06T 2207/20081 (2013.01); G06T 2207/20084 (2013.01); G06T 2210/61 (2013.01); G06V 10/469 (2022.01); G06V 30/10 (2022.01)]
6 Claims
1. A learning apparatus for image generation, comprising:
a preprocessing module configured to receive text for image generation and generate a sentence feature vector and a word feature vector from the received text;
a first generative adversarial network (GAN) configured to receive the sentence feature vector from the preprocessing module and generate an initial image based on the received sentence feature vector; and
a second generative adversarial network configured to receive the word feature vector generated by the preprocessing module and the initial image generated by the first generative adversarial network and generate a final image based on the word feature vector and the initial image,
wherein the second generative adversarial network includes:
a second generator configured to receive the word feature vector generated by the preprocessing module and the initial image generated by the first generative adversarial network, generate an enhanced image from the word feature vector and a feature map of the initial image based on a dynamic memory, generate a feature map of the enhanced image from the enhanced image by using a non-local block, and generate a final image from the word feature vector and the feature map of the enhanced image based on the dynamic memory; and
a second discriminator configured to compare the final image generated by the second generator with a preset second comparison image, determine whether the received image is the second comparison image or the generated final image according to the comparison result, and feed back the determination result to the second generator,
wherein the second generator includes:
an image enhancement module configured to generate a key and a value for storage in the dynamic memory by combining the word feature vector with the initial image, extract a key similar to the generated key from among the generated key and a key pre-stored in the dynamic memory to calculate a similarity between the generated key and the extracted key, and output a weighted sum of the values based on the calculated similarity;
an image feature generation module configured to generate an enhanced image based on the output weighted sum and the initial image; and
a non-local block module configured to generate a feature map of the enhanced image by extracting similar pixels for all regions from the generated enhanced image and resetting the pixels to an average value of similar pixels by using the non-local block.
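
By way of a non-limiting illustration, the memory read of the image enhancement module and the pixel averaging of the non-local block module recited above can be sketched in plain Python. The claim does not fix a similarity measure, a normalization, or the way the weighted sum is combined with the initial image, so the dot product, the softmax weighting, and the plain sum used below are assumptions. The function names (`memory_read`, `enhance`, `non_local`) are hypothetical, and the learned projections a trained generator would apply are omitted.

```python
import math


def dot(a, b):
    """Dot product, used here as the (assumed) similarity measure."""
    return sum(x * y for x, y in zip(a, b))


def softmax(scores):
    """Normalize raw similarity scores into weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def memory_read(query_key, keys, values):
    """Return the similarity-weighted sum of the stored values."""
    weights = softmax([dot(query_key, k) for k in keys])
    width = len(values[0])
    return [sum(w * v[i] for w, v in zip(weights, values)) for i in range(width)]


def enhance(initial_pixel, readout):
    """Combine a memory readout with the initial feature (assumed: plain sum)."""
    return [a + b for a, b in zip(initial_pixel, readout)]


def non_local(pixels):
    """Replace every pixel by a similarity-weighted average over all pixels."""
    return [memory_read(p, pixels, pixels) for p in pixels]


# Two memory slots with identical keys: both are equally similar to the query.
keys = [[1.0, 0.0], [1.0, 0.0]]
values = [[2.0, 4.0], [4.0, 8.0]]
readout = memory_read([1.0, 0.0], keys, values)
enhanced = enhance([1.0, 1.0], readout)

# Three pixel features: the first two are identical, the third is an outlier.
smoothed = non_local([[0.0, 1.0], [0.0, 1.0], [5.0, 0.0]])
```

In this sketch each pixel of the enhanced image acts as its own query, so pixels with similar features pull toward a shared average while a dissimilar pixel stays close to its original value. The softmax weighting is a soft stand-in for the claimed "average value of similar pixels" and is not taken from the patent text.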