US 12,309,211 B1
Automatic image translation for virtual meetings
Claudio Fantinuoli, New York, NY (US)
Assigned to KUDO, INC., New York, NY (US)
Filed by KUDO, INC., New York, NY (US)
Filed on Mar. 8, 2023, as Appl. No. 18/119,188.
Application 18/119,188 is a continuation of application No. 17/373,494, filed on Jul. 12, 2021, abandoned.
Int. Cl. H04L 65/403 (2022.01); G06N 20/00 (2019.01); G06V 30/148 (2022.01); G10L 21/10 (2013.01)
CPC H04L 65/403 (2013.01) [G06N 20/00 (2019.01); G06V 30/153 (2022.01); G10L 21/10 (2013.01); G06V 30/148 (2022.01)] 20 Claims
OG exemplary drawing
 
1. A method comprising:
identifying, by a processor executing a machine-learning model, visual presentation content shared by a first electronic device of a first participant to a second electronic device of a second participant and a third electronic device of a third participant during an electronic communication session;
decoupling, by the processor, using an image segmentation protocol, a region of the visual presentation content from other visual elements shared during the electronic communication session;
executing an image recognition protocol on the region and not the other visual elements to identify a first set of words of the visual presentation content displayed within the region, wherein the first set of words is in a first language;
identifying, by the processor, a second preferred language of the second participant of the electronic communication session and a third preferred language of the third participant of the electronic communication session;
presenting, by the processor as a first overlay on at least a portion of the region during the electronic communication session on the second electronic device associated with the second participant, a second set of words in the second preferred language of the second participant, the second set of words corresponding to the first set of words;
presenting, by the processor as a second overlay on at least a portion of the region during the electronic communication session on the third electronic device associated with the third participant, a third set of words in the third preferred language of the third participant, the third set of words corresponding to the first set of words;
monitoring, by the processor, a set of pixels within the region to identify a visual revision within the region that satisfies a threshold of pixel differentiation between the set of pixels in a first pixel capture and the set of pixels in a second pixel capture; and
in response to identifying the visual revision:
executing, by the processor, the image recognition protocol to identify a fourth set of words within the region having the visual revision, wherein the fourth set of words is in the first language;
presenting, by the processor within the first overlay on the second electronic device, a fifth set of words, in the second preferred language, corresponding to the fourth set of words; and
presenting, by the processor within the second overlay on the third electronic device, a sixth set of words, in the third preferred language, corresponding to the fourth set of words.
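The flow recited in claim 1 — decoupling a region, recognizing its words, overlaying per-participant translations, then monitoring the region's pixels and re-running recognition when a pixel-differentiation threshold is met — can be illustrated with a minimal sketch. The patent publishes no code; every name below (`Region`, `recognize_words`, `SIGNATURES`, `GLOSSARY`, the device labels, the 0.25 threshold) is a hypothetical stand-in, with recognition and translation modeled as table lookups rather than real OCR or machine translation.

```python
from dataclasses import dataclass

@dataclass
class Region:
    pixels: list  # flat pixel values for the decoupled region of shared content

# Stand-in for the image recognition protocol: map a pixel signature
# to the words it depicts (first set of words, in the first language).
SIGNATURES = {
    (0, 1, 1, 0): ["quarterly", "results"],
    (1, 1, 1, 0): ["updated", "results"],
}

def recognize_words(region):
    return SIGNATURES.get(tuple(region.pixels), [])

# Toy per-language translation tables keyed by each participant's
# preferred language (second and third preferred languages).
GLOSSARY = {
    "es": {"quarterly": "trimestrales", "results": "resultados",
           "updated": "actualizados"},
    "fr": {"quarterly": "trimestriels", "results": "résultats",
           "updated": "actualisés"},
}

def translate(words, language):
    return [GLOSSARY[language][w] for w in words]

def visual_revision(first_capture, second_capture, threshold=0.25):
    """True when the fraction of differing pixels between the first and
    second pixel captures satisfies the pixel-differentiation threshold."""
    changed = sum(a != b for a, b in zip(first_capture, second_capture))
    return changed / len(first_capture) >= threshold

# Initial share: recognize the region once, then present an overlay of
# translated words on each participant's device.
region = Region(pixels=[0, 1, 1, 0])
first_words = recognize_words(region)
overlays = {"device2": translate(first_words, "es"),
            "device3": translate(first_words, "fr")}

# The presenter edits the slide; monitoring detects the visual revision,
# and recognition plus translation re-run into the existing overlays.
revised = [1, 1, 1, 0]
if visual_revision(region.pixels, revised):
    region.pixels = revised
    fourth_words = recognize_words(region)
    overlays["device2"] = translate(fourth_words, "es")  # fifth set of words
    overlays["device3"] = translate(fourth_words, "fr")  # sixth set of words
```

In this sketch only the monitored region is re-recognized after a revision, mirroring the claim's limitation that the image recognition protocol runs on the region "and not the other visual elements."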