US 12,248,760 B2
Electronic device and method for controlling the electronic device thereof
Jiwan Kim, Suwon-si (KR); Insoo Chung, Suwon-si (KR); Jonghyun Kim, Suwon-si (KR); Soyoon Park, Suwon-si (KR); Indong Lee, Suwon-si (KR); and Sungjun Lim, Suwon-si (KR)
Assigned to Samsung Electronics Co., Ltd., Suwon-si (KR)
Filed by Samsung Electronics Co., Ltd., Suwon-si (KR)
Filed on May 18, 2022, as Appl. No. 17/747,399.
Application 17/747,399 is a continuation of application No. PCT/KR2021/012761, filed on Sep. 17, 2021.
Claims priority of application No. 10-2020-0140660 (KR), filed on Oct. 27, 2020.
Prior Publication US 2022/0318524 A1, Oct. 6, 2022
Int. Cl. G06F 40/58 (2020.01); G06F 40/51 (2020.01); G06V 30/146 (2022.01); G06V 30/148 (2022.01)

CPC G06F 40/58 (2020.01) [G06F 40/51 (2020.01); G06V 30/147 (2022.01); G06V 30/153 (2022.01)]

20 Claims

11. An electronic device comprising:

a camera;

memory storing one or more computer programs; and

one or more processors communicatively coupled to the memory and the camera,

wherein the one or more computer programs include computer-executable instructions that, when executed by the one or more processors individually or collectively, cause the electronic device to:

obtain an image comprising text through the camera,

identify input text to be translated among texts included in the image,

obtain a first vector corresponding to the input text by inputting the input text to an encoder of a translation model,

identify whether additional information is necessary to translate the input text by inputting the first vector to a first artificial intelligence model trained to translate the input text, wherein the first artificial intelligence model is trained using a ratio value corresponding to the first vector, and wherein the additional information is identified as being necessary to translate the input text based on a result of comparing the ratio value with a preset value,

based on identifying that the additional information is necessary, identify the additional information among context information by inputting the first vector and at least one piece of context information obtained from the image to a second artificial intelligence model trained to identify the additional information, and

obtain output text corresponding to the input text by inputting the first vector and the additional information to a decoder of the translation model.
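
The control flow recited in claim 11 can be illustrated with a minimal Python sketch. It is not the patented implementation. Every name in it (`TranslationPipeline`, `toy_encode`, `toy_ratio`, `toy_select_context`, `toy_decode`, the `threshold` value) is hypothetical, and the models are replaced by trivial stand-in functions. The claim only states that the ratio value is compared with a preset value, so the direction of the comparison (`>=`) is an assumption. The claim also recites the ratio value as a training signal for the first model, whereas the sketch has the first model output a ratio-like score directly.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence

Vector = List[float]


@dataclass
class TranslationPipeline:
    """Hypothetical wiring of the four components named in the claim."""
    encode: Callable[[str], Vector]              # encoder of the translation model
    ratio_model: Callable[[Vector], float]       # stand-in for the first AI model
    select_context: Callable[[Vector, Sequence[str]], Optional[str]]  # second AI model
    decode: Callable[[Vector, Optional[str]], str]  # decoder of the translation model
    threshold: float = 0.2                       # the "preset value" (assumed)

    def translate(self, input_text: str, context_info: Sequence[str]) -> str:
        # Obtain the first vector from the encoder.
        first_vector = self.encode(input_text)
        # Decide whether additional information is needed.
        # Assumption: a ratio at or above the preset value means "needed".
        needs_more = self.ratio_model(first_vector) >= self.threshold
        additional = None
        if needs_more:
            # Pick the additional information out of the context information.
            additional = self.select_context(first_vector, context_info)
        # Decode from the first vector plus any additional information.
        return self.decode(first_vector, additional)


# Toy stand-ins so the sketch runs; none of these is a real model.
def toy_encode(text: str) -> Vector:
    return [float(ord(ch)) for ch in text]


def toy_ratio(vector: Vector) -> float:
    # Shorter inputs are treated as more ambiguous (purely illustrative).
    return 1.0 / len(vector) if vector else 0.0


def toy_select_context(vector: Vector, context_info: Sequence[str]) -> Optional[str]:
    return context_info[0] if context_info else None


def toy_decode(vector: Vector, additional: Optional[str]) -> str:
    text = "".join(chr(int(v)) for v in vector).upper()
    return f"{text} [ctx: {additional}]" if additional else text


def demo_pipeline() -> TranslationPipeline:
    return TranslationPipeline(toy_encode, toy_ratio, toy_select_context, toy_decode)


if __name__ == "__main__":
    pipe = demo_pipeline()
    context = ["photo of a baseball field"]
    print(pipe.translate("bat", context))
    print(pipe.translate("baseball bat", context))
```

In this sketch the second model is invoked only when the first check indicates that additional information is needed, which mirrors the "based on identifying that the additional information is necessary" condition of the claim. Otherwise the decoder receives the first vector alone.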