US 11,940,962 B2
Preparing a database for a domain specific application using a centralized data repository
Nikolaos Livathinos, Adliswil (CH); Maksym Lysak, Waedenswil (CH); Viktor Kuropiatnyk, Horgen (CH); Cesar Berrospi Ramis, Zurich (CH); Peter Willem Jan Staar, Zurich (CH); and Abderrahim Labbi, Gattikon (CH)
Assigned to International Business Machines Corporation, Armonk, NY (US)
Filed by International Business Machines Corporation, Armonk, NY (US)
Filed on Dec. 9, 2021, as Appl. No. 17/546,265.
Prior Publication US 2023/0185776 A1, Jun. 15, 2023
Int. Cl. G06F 16/21 (2019.01)
CPC G06F 16/211 (2019.01)
20 Claims
 
1. A method for creating a database for a domain specific application, the method comprising:
providing, by one or more processors, a centralized data repository comprising data from different sources;
identifying, by one or more processors, a set of data units of the repository that represent a specific domain;
determining, by one or more processors, a pivotal entity type for an application based, at least in part, on input from one or more subject matter experts, wherein the pivotal entity type is the most significant entity type for the application;
determining, by one or more processors, based at least in part on the set of data units, a mapping between different identifiers of the pivotal entity type;
creating, by one or more processors, a reference set by selecting a first subset of the set of data units using the set of data units and the mapping, wherein the first subset of the set of data units represents the pivotal entity type;
selecting, by one or more processors, based at least in part on the mapping, a second subset of the set of data units, wherein the second subset of the set of data units represents non-pivotal entity types which are related to instances of the pivotal entity type in the reference set; and
creating, by one or more processors, a database from data units and associated attributes selected from the reference set of data units and the second subset of data units.
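
The steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation. Every name in it (`DataUnit`, `build_identifier_mapping`, `create_database`, the `aliases` and `related_to` fields) and all sample data are hypothetical. The sketch assumes the subject matter experts' choice of pivotal entity type is passed in as a plain string, that each pivotal data unit lists its own alternative identifiers, and that the lexicographically smallest identifier serves as the canonical one.

```python
from dataclasses import dataclass, field


@dataclass
class DataUnit:
    """Hypothetical record in the centralized repository."""
    source: str
    domain: str
    entity_type: str
    identifier: str
    aliases: tuple = ()        # other identifiers for the same entity (assumed field)
    related_to: tuple = ()     # identifiers of entities this unit refers to (assumed field)
    attributes: dict = field(default_factory=dict)


def build_identifier_mapping(units, pivotal_type):
    """Map every known identifier of a pivotal entity to one canonical identifier."""
    mapping = {}
    for unit in units:
        if unit.entity_type != pivotal_type:
            continue
        names = (unit.identifier, *unit.aliases)
        canonical = min(names)  # assumption: smallest string is the canonical id
        for name in names:
            mapping[name] = canonical
    return mapping


def create_database(repository, domain, pivotal_type):
    # Identify the data units that represent the specific domain.
    domain_units = [u for u in repository if u.domain == domain]

    # Determine the mapping between different identifiers of the pivotal type.
    mapping = build_identifier_mapping(domain_units, pivotal_type)

    # Reference set: pivotal units, merged under their canonical identifier.
    reference = {}
    for unit in domain_units:
        if unit.entity_type == pivotal_type:
            reference.setdefault(mapping[unit.identifier], {}).update(unit.attributes)

    # Second subset: non-pivotal units related to an instance in the reference set.
    related = []
    for unit in domain_units:
        if unit.entity_type == pivotal_type:
            continue
        targets = sorted({mapping[r] for r in unit.related_to if r in mapping})
        if targets:
            related.append({"type": unit.entity_type, "id": unit.identifier,
                            "pivotal_ids": targets, **unit.attributes})

    # The "database" here is just two in-memory tables.
    return {"pivotal": reference, "related": related}


# Hypothetical sample data from four sources.
repository = [
    DataUnit("vendor_a", "materials", "material", "alloy-x",
             aliases=("M-001",), attributes={"density": 7.8}),
    DataUnit("vendor_b", "materials", "material", "M-001",
             aliases=("alloy-x",), attributes={"supplier": "vendor_b"}),
    DataUnit("patents", "materials", "patent", "P-1",
             related_to=("alloy-x",), attributes={"title": "Coating process"}),
    DataUnit("papers", "materials", "paper", "D-7", related_to=("alloy-y",)),
    DataUnit("news", "finance", "company", "C-9"),
]

db = create_database(repository, domain="materials", pivotal_type="material")
```

In this sketch the two `material` records collapse into one reference entry because the mapping resolves both identifiers to the same canonical value. The patent `P-1` is kept because it refers to that entry, while the paper `D-7` (which refers to an unmapped identifier) and the out-of-domain `company` record are dropped.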