US 11,704,343 B2
Method and system for advanced adaptive database matching
Bernd Reimann, Heerbrugg (CH); Alexandre Heili, Altstätten (CH); Nicholas Bade, Widnau (CH); Akshit Budhraja, Altstätten (CH); Krishan Kumar Meghani, Hyderabad (IN); Jyotirmoy Verma, Hyderabad (IN); and Kumara Chandra Singarapu, Hyderabad (IN)
Assigned to HEXAGON TECHNOLOGY CENTER GMBH, Heerbrugg (CH); and HEXAGON CAPABILITY CENTER INDIA PRIVATE LIMITED, Hyderabad (IN)
Filed by HEXAGON TECHNOLOGY CENTER GMBH, Heerbrugg (CH); and HEXAGON CAPABILITY CENTER INDIA PRIVATE LIMITED, Telangana (IN)
Filed on May 19, 2021, as Appl. No. 17/324,755.
Claims priority of application No. 202011021043 (IN), filed on May 19, 2020.
Prior Publication US 2021/0365479 A1, Nov. 25, 2021
Int. Cl. G06F 16/245 (2019.01); G06F 16/28 (2019.01); G06F 16/248 (2019.01); G06N 20/00 (2019.01)
CPC G06F 16/285 (2019.01) [G06F 16/245 (2019.01); G06F 16/248 (2019.01); G06N 20/00 (2019.01)]
21 Claims
1. A method of associating data from a plurality of databases, the method comprising:
accessing a first database and a second database, wherein the first database comprises a first dataset of object descriptions associated with a plurality of objects as per a first schema, and wherein the second database comprises a second dataset of object descriptions associated with the plurality of objects as per a second schema, wherein each object of the plurality of objects is a digital representation of a corresponding physical world item, the items comprising assets at industrial or construction facilities, the assets including specific pipes, specific valves and specific gaskets;
identifying a first set of expressions and a second set of expressions corresponding to the first dataset and the second dataset, respectively, wherein each expression of the first set of expressions and the second set of expressions comprises at least one entry encoded using alphanumerical characters and defines an attribute of an object from the plurality of objects;
determining a first set of clusters and a second set of clusters corresponding to the first database and the second database, respectively, based on at least one of domain data associated with the plurality of objects, an object category associated with the plurality of objects, a set of domain rules, the first set of expressions, and the second set of expressions, wherein each cluster in the first set of clusters comprises a corresponding first set of contextually similar objects from the plurality of objects and each cluster in the second set of clusters comprises a corresponding second set of contextually similar objects from the plurality of objects, the object category comprising a hierarchical categorization of pipes, valves and gaskets; and
creating a relational database based on a set of relationships and a mapping function determined based on the first set of clusters and the second set of clusters, wherein the relational database comprises a mapping between at least one of the first set of clusters and the second set of clusters, the first set of expressions and the second set of expressions, the first set of contextually similar objects and the second set of contextually similar objects, and the first schema and the second schema,
wherein the determining of the first set of clusters and second set of clusters comprises
performing an Artificial Intelligence (AI) based technique comprising:
analysing the first set of expressions and the second set of expressions; and
determining the first set of clusters and the second set of clusters based on a semantic and contextual similarity between the expressions in the first set of expressions and the second set of expressions, respectively.
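
The steps recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation: the claim calls for an AI based technique using semantic and contextual similarity, whereas this sketch substitutes plain token overlap (Jaccard similarity) as a crude stand-in. The sample object descriptions, the threshold of 0.5, the function names, and the `cluster_map` table are all hypothetical and chosen only for illustration.

```python
import re
import sqlite3


def tokens(expression):
    """Lower-case alphanumeric tokens of one expression."""
    return frozenset(re.findall(r"[a-z0-9]+", expression.lower()))


def jaccard(a, b):
    """Token overlap in [0, 1]; an assumed stand-in for semantic similarity."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster(expressions, threshold=0.5):
    """Greedy one-pass clustering: join the first cluster whose seed is similar enough."""
    clusters = []  # list of (seed_tokens, member_expressions)
    for expr in expressions:
        t = tokens(expr)
        for seed, members in clusters:
            if jaccard(seed, t) >= threshold:
                members.append(expr)
                break
        else:
            clusters.append((t, [expr]))
    return [members for _, members in clusters]


def map_clusters(first, second):
    """Map each first-schema cluster index to its most similar second-schema cluster."""
    def bag(members):
        return frozenset().union(*(tokens(m) for m in members))

    mapping = {}
    for i, a in enumerate(first):
        scores = [jaccard(bag(a), bag(b)) for b in second]
        best = max(range(len(second)), key=scores.__getitem__)
        if scores[best] > 0:
            mapping[i] = best
    return mapping


# Hypothetical descriptions of the same kinds of assets under two schemas.
first_dataset = [
    "gate valve 2 in carbon steel",
    "gate valve 4 in carbon steel",
    "spiral wound gasket 2 in",
    "pipe seamless 6 in carbon steel",
]
second_dataset = [
    "GASKET, SPIRAL WOUND",
    "VALVE, GATE, STEEL",
    "PIPE, SEAMLESS, STEEL",
    "VALVE, GATE, BRONZE",
]

first_clusters = cluster(first_dataset)
second_clusters = cluster(second_dataset)
mapping = map_clusters(first_clusters, second_clusters)

# Persist the cluster-to-cluster mapping in a relational table (in memory here).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cluster_map (first_cluster INTEGER, second_cluster INTEGER)")
conn.executemany("INSERT INTO cluster_map VALUES (?, ?)", mapping.items())
rows = conn.execute(
    "SELECT first_cluster, second_cluster FROM cluster_map ORDER BY first_cluster"
).fetchall()
```

With these sample inputs, each dataset falls into three clusters (valves, gaskets, pipes), and `rows` pairs each first-schema cluster with its second-schema counterpart. A real system would presumably replace `jaccard` with a learned similarity model and would also store mappings between expressions, objects, and schemas, as the claim recites.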