US 11,901,047 B2
Medical visual question answering
Yuan Zhou, Beijing (CN); Jing Mei, Beijing (CN); Shiwan Zhao, Beijing (CN); Yi Qin Yu, Beijing (CN); Xu Min, Beijing (CN); and Yan Fei Wang, Shanghai (CN)
Assigned to International Business Machines Corporation, Armonk, NY (US)
Filed by International Business Machines Corporation, Armonk, NY (US)
Filed on Oct. 28, 2020, as Appl. No. 17/082,334.
Prior Publication US 2022/0130499 A1, Apr. 28, 2022
Int. Cl. G06T 11/00 (2006.01); G16H 10/20 (2018.01); G16H 30/40 (2018.01); G16H 50/70 (2018.01); G16H 50/20 (2018.01); G06F 16/538 (2019.01); G06N 3/04 (2023.01); G06F 40/30 (2020.01); G06F 40/205 (2020.01); G06F 40/284 (2020.01); A61B 6/00 (2006.01); G16H 30/20 (2018.01); G06V 10/40 (2022.01); G06F 18/213 (2023.01); G06F 18/214 (2023.01); A61B 6/03 (2006.01); A61B 5/00 (2006.01)

CPC G16H 10/20 (2018.01) [A61B 6/5217 (2013.01); G06F 16/538 (2019.01); G06F 18/213 (2023.01); G06F 18/214 (2023.01); G06F 40/205 (2020.01); G06F 40/284 (2020.01); G06F 40/30 (2020.01); G06N 3/04 (2013.01); G06T 11/00 (2013.01); G06V 10/40 (2022.01); G16H 30/20 (2018.01); G16H 30/40 (2018.01); G16H 50/20 (2018.01); G16H 50/70 (2018.01); A61B 5/0077 (2013.01); A61B 6/032 (2013.01); G06T 2210/12 (2013.01); G06V 2201/03 (2022.01)]

20 Claims

1. A computer-implemented method comprising:

extracting, by a processor, a domain-specific object feature from a first image data, wherein the feature describes an object in the first image data;

determining, by the processor, domain-specific semantic meaning of text data;

mapping, by the processor, the object feature to a portion of the text data, wherein the portion of the text data describes the object;

creating, by the processor, a joint representation of the object and the portion of the text data;

receiving, by the processor, a second image data and a query directed towards an object in the second image data; and

generating, by the processor, an answer to the query based on the joint representation.
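
The steps recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation: the detector and the semantic parser are replaced by simple stand-ins (pre-annotated image regions and keyword matching). Every name in it (DOMAIN_TERMS, ObjectFeature, extract_object_features, build_joint_representation, answer_query) and the dictionary layout used for image data are assumptions introduced only for illustration.

```python
from dataclasses import dataclass

# Hypothetical toy domain vocabulary; not taken from the patent.
DOMAIN_TERMS = {"nodule", "lung", "fracture", "rib"}


@dataclass(frozen=True)
class ObjectFeature:
    label: str   # domain-specific object name, e.g. "nodule"
    box: tuple   # (x, y, width, height) of the region in the image


def extract_object_features(image):
    """Stand-in for a detector: keep annotated regions whose label is a domain term."""
    return [ObjectFeature(r["label"], tuple(r["box"]))
            for r in image["regions"] if r["label"] in DOMAIN_TERMS]


def split_sentences(text):
    """Split text on periods and drop empty fragments."""
    return [s.strip() for s in text.split(".") if s.strip()]


def domain_terms_in(sentence):
    """Stand-in for semantic analysis: the domain terms a sentence mentions."""
    words = {w.strip(",;:?!").lower() for w in sentence.split()}
    return words & DOMAIN_TERMS


def build_joint_representation(image, text):
    """Map each object feature to the sentences describing it and store the pair."""
    sentences = split_sentences(text)
    joint = {}
    for feat in extract_object_features(image):
        described = [s for s in sentences if feat.label in domain_terms_in(s)]
        if described:
            joint[feat.label] = {"box": feat.box, "text": described}
    return joint


def answer_query(joint, second_image, query):
    """Answer a query about an object in a second image from the joint representation."""
    asked = domain_terms_in(query)
    for feat in extract_object_features(second_image):
        if feat.label in asked and feat.label in joint:
            return joint[feat.label]["text"][0]
    return "unknown"


if __name__ == "__main__":
    first_image = {"regions": [{"label": "nodule", "box": [40, 52, 12, 12]}]}
    report = "A small nodule is seen in the right upper lobe. No fracture."
    joint = build_joint_representation(first_image, report)

    second_image = {"regions": [{"label": "nodule", "box": [10, 20, 8, 8]}]}
    print(answer_query(joint, second_image, "What does the nodule look like?"))
```

In this sketch the joint representation is a plain dictionary keyed by object label. A learned system would more plausibly use embeddings for both the image features and the text, but the dictionary keeps the data flow between the claimed steps easy to follow.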