US 11,687,514 B2
Multimodal table encoding for information retrieval systems
Roee Shraga, Haifa (IL); Haggai Roitman, Yoknea'm Elit (IL); Guy Feigenblat, Givataym (IL); and Mustafa Canim, Ossining, NY (US)
Assigned to International Business Machines Corporation, Armonk, NY (US)
Filed by INTERNATIONAL BUSINESS MACHINES CORPORATION, Armonk, NY (US)
Filed on Jul. 15, 2020, as Appl. No. 16/929,194.
Prior Publication US 2022/0043794 A1, Feb. 10, 2022
Int. Cl. G06F 16/22 (2019.01); G06F 16/93 (2019.01); G06F 16/245 (2019.01); G06F 16/21 (2019.01)
CPC G06F 16/2282 (2019.01) [G06F 16/212 (2019.01); G06F 16/245 (2019.01); G06F 16/93 (2019.01)]
18 Claims
 
1. A method comprising, automatically:
receiving an electronic document that contains a table, wherein:
the table comprises: multiple rows, multiple columns, and a schema comprising column labels or row labels, and
the electronic document comprises a description of the table which is located externally to the table;
operating separate machine learning encoders to separately encode the description of the table, the schema of the table, each of the rows of the table, and each of the columns of the table, respectively, the machine learning encoders including a neural network to encode the description of the table, wherein:
the schema of the table is encoded together with end-of-column tokens or end-of-row tokens, that mark an end of each of the column or row labels, respectively,
each of the rows of the table is encoded together with end-of-column tokens that mark an end of each data cell of the respective row, and with an end-of-row token that marks an end of the respective row, and
each of the columns of the table is encoded together with end-of-row tokens that mark an end of each data cell of the respective column, and with an end-of-column token that marks an end of the respective column;
applying a machine learning gating mechanism to the encoded description, encoded schema, encoded rows, and encoded columns, to produce a fused encoding of the table, wherein the fused encoding is representative of both a structure of the table and a content of the table; and
storing the fused encoding of the table in an index of a computerized information retrieval system.
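
The delimiter scheme and the fusion step recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation: the token strings `[EOC]` and `[EOR]`, the function names, and the softmax-weighted sum standing in for the "machine learning gating mechanism" are all assumptions made for illustration. The claim does not specify the token spelling, the encoder architectures (beyond a neural network for the description), or the form of the gate. The schema here is assumed to consist of column labels.

```python
import math

# Hypothetical marker strings; the claim only requires that such tokens exist.
EOC = "[EOC]"  # end-of-column token
EOR = "[EOR]"  # end-of-row token


def serialize_schema(column_labels):
    """Column labels, each followed by an end-of-column token."""
    out = []
    for label in column_labels:
        out += [str(label), EOC]
    return out


def serialize_row(row):
    """Each data cell ends with EOC; the row as a whole ends with EOR."""
    out = []
    for cell in row:
        out += [str(cell), EOC]
    out.append(EOR)
    return out


def serialize_column(column):
    """Each data cell ends with EOR; the column as a whole ends with EOC."""
    out = []
    for cell in column:
        out += [str(cell), EOR]
    out.append(EOC)
    return out


def gated_fusion(vectors, gate_weights):
    """Toy stand-in for a learned gate (an assumption, not the claimed design).

    Scores each input vector by a dot product with gate_weights, turns the
    scores into softmax gates, and returns the gate-weighted sum.
    """
    scores = [sum(w * x for w, x in zip(gate_weights, v)) for v in vectors]
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    gates = [e / total for e in exps]
    dim = len(vectors[0])
    return [sum(g * v[i] for g, v in zip(gates, vectors)) for i in range(dim)]


labels = ["Country", "Capital"]
rows = [["France", "Paris"], ["Japan", "Tokyo"]]
columns = [list(col) for col in zip(*rows)]  # transpose rows into columns

schema_tokens = serialize_schema(labels)
row_tokens = [serialize_row(r) for r in rows]
column_tokens = [serialize_column(c) for c in columns]

# Placeholder vectors standing in for the outputs of the separate encoders.
fused = gated_fusion([[1.0, 0.0], [0.0, 1.0]], gate_weights=[0.0, 0.0])
```

In this sketch the same two tokens play opposite roles in the row and column views: a row uses `[EOC]` between cells and `[EOR]` at its end, while a column uses `[EOR]` between cells and `[EOC]` at its end, which mirrors the wording of the claim. In a real system the sequences would be fed to trained encoders and the gate weights would be learned; the resulting fused vector is what the final step would store in the retrieval index.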