1. A method of extracting information from heterogeneous tables in semi-structured text and unstructured text, the method comprising steps of:
identifying, by a computing device, target content from a table in an electronic document, wherein the target content is presented in a plurality of cells;
classifying, by the computing device, the plurality of cells into one or more of header cells and a plurality of data cells based on at least one of explicit coding of the plurality of cells, formatting of the plurality of cells, relationship between the one or more header cells and columns in the table, presence of horizontal lines in the table, type of the target content in the plurality of cells, presence of measurement units within brackets in the table, and presence of words referring to mathematical operations on values in a table;
annotating, automatically by the computing device, the plurality of data cells to indicate their positions in the table and an association between each of the plurality of data cells and the one or more header cells to enable extraction of the target content from the table; and
indexing, by the computing device, the electronic document utilizing the association between the plurality of data cells and the one or more header cells for responding to search queries.
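
For illustration only, the classifying, annotating, and indexing steps recited above might be sketched as follows. This is a minimal sketch under stated assumptions, not the claimed implementation: the `Cell` structure, the function names `classify`, `annotate`, and `build_index`, the regular expressions, and the sample table are all hypothetical, and only a few of the recited classification cues (explicit coding, formatting, units within brackets, words referring to mathematical operations) are modeled. Round parentheses stand in for "brackets" here as an assumption.

```python
import re
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Cell:
    """Hypothetical cell model; the claim does not prescribe a data structure."""
    text: str
    tag: str = "td"      # explicit coding cue, e.g. "th" versus "td"
    bold: bool = False   # formatting cue
    role: str = ""       # set by classify(): "header" or "data"
    row: int = -1        # set by annotate()
    col: int = -1        # set by annotate()
    headers: list = field(default_factory=list)


# Assumed patterns: a unit inside parentheses, e.g. "Weight (kg)",
# and a few words that refer to mathematical operations on values.
UNIT_IN_BRACKETS = re.compile(r"\([^()\d]+\)")
MATH_WORDS = re.compile(r"\b(total|mean|average|sum)\b", re.IGNORECASE)


def classify(table):
    """Label each cell as header or data using a subset of the recited cues."""
    for row in table:
        for cell in row:
            is_header = (
                cell.tag == "th"
                or cell.bold
                or bool(UNIT_IN_BRACKETS.search(cell.text))
                or bool(MATH_WORDS.search(cell.text))
            )
            cell.role = "header" if is_header else "data"


def annotate(table):
    """Record each data cell's position and its associated header cells."""
    for r, row in enumerate(table):
        for c, cell in enumerate(row):
            cell.row, cell.col = r, c
            if cell.role != "data":
                continue
            # Column headers: header cells above, in the same column.
            col_headers = [
                table[i][c].text
                for i in range(r)
                if c < len(table[i]) and table[i][c].role == "header"
            ]
            # Row headers: header cells to the left, in the same row.
            row_headers = [row[j].text for j in range(c) if row[j].role == "header"]
            cell.headers = col_headers + row_headers


def build_index(table):
    """Map each lower-cased header text to the data cells associated with it."""
    index = defaultdict(list)
    for row in table:
        for cell in row:
            if cell.role == "data":
                for header in cell.headers:
                    index[header.lower()].append((cell.row, cell.col, cell.text))
    return index


# Hypothetical sample table: one header row and two data rows.
table = [
    [Cell("Patient", tag="th"), Cell("Weight (kg)", tag="th")],
    [Cell("A", bold=True), Cell("70")],
    [Cell("B", bold=True), Cell("82")],
]
classify(table)
annotate(table)
index = build_index(table)
print(index["weight (kg)"])  # [(1, 1, '70'), (2, 1, '82')]
```

In this sketch a search query for a header term is answered by a dictionary lookup, so the header-to-data association computed during annotation is what makes the data cell values retrievable.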