US 11,675,825 B2
Method and system for principled approach to scientific knowledge representation, extraction, curation, and utilization
Andrew Walter Crapo, Niskayuna, NY (US); Nurali Virani, Niskayuna, NY (US); and Varish Mulwad, Niskayuna, NY (US)
Assigned to GENERAL ELECTRIC COMPANY, Schenectady, NY (US)
Filed by GENERAL ELECTRIC COMPANY, Schenectady, NY (US)
Filed on Feb. 14, 2020, as Appl. No. 16/791,617.
Claims priority of provisional application 62/805,772, filed on Feb. 14, 2019.
Prior Publication US 2020/0265060 A1, Aug. 20, 2020
Int. Cl. G06F 16/36 (2019.01); G06F 16/25 (2019.01); G06F 16/245 (2019.01); G06N 5/02 (2023.01)
CPC G06F 16/367 (2019.01) [G06F 16/245 (2019.01); G06F 16/254 (2019.01); G06N 5/02 (2013.01)]
17 Claims
 
1. A system comprising:
a memory storing processor-executable instructions; and
a processor to execute the processor-executable instructions to cause the system to:
extract information from at least one of code and text documentation, the extracted information conforming to a base ontology and being extracted in the context of a knowledge graph,
the extracting of information from code includes representing the code as an abstract syntax tree (AST), extract a set of equations from the codes based on an analysis of the AST, and expressing the extracted set of equations as a computational graph with semantic metadata characterizing the computational graph; and
the extracting of information from text documentation includes identifying and extracting scientific concepts and equations in the text, reconciling extracted concepts with one or more existing concepts in the knowledge graph, identifying concepts between the extracted concepts, and expressing the extracted and reconciled concepts and the relations in a format compatible with the base ontology,
wherein the extracting of information from at least one of code and text documentation comprises extracting aligned information from a plurality of sources of the at least one of code and text documentation;
add the extracted information to the knowledge graph, including the computational graph for the extracted code and the extracted and reconciled concepts and the relations for the extracted text;
generate, in a mixed interaction with a user selectively in communication with the system, computational models including scientific knowledge, wherein the mixed interaction includes statements between the user and the system that are initiated by either the user or the system and the statements are included in the computational models with annotations to their location; and
persist, in a memory, a record of the generated computational models.
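
Illustrative sketch (not part of the claim): the code-extraction step recited above (represent code as an AST, extract equations from it, express them as a computational graph with semantic metadata) might look roughly like the following in Python. This is a minimal sketch under stated assumptions, not the patented implementation. The function names extract_equations and build_graph, the sample SOURCE, and the variable-to-concept mapping are all hypothetical, and only simple single-target assignments are treated as equations.

```python
import ast

# Hypothetical sample of scientific code to analyze (an assumption for illustration).
SOURCE = """
def speed_of_sound(gamma, R, T):
    a2 = gamma * R * T
    a = a2 ** 0.5
    return a
"""


def extract_equations(source):
    """Parse source into an AST and return (target, expression, inputs) tuples.

    Only assignments of the form `name = <expression>` are treated as equations;
    this is a simplifying assumption of the sketch.
    """
    tree = ast.parse(source)
    assigns = [
        node for node in ast.walk(tree)
        if isinstance(node, ast.Assign)
        and len(node.targets) == 1
        and isinstance(node.targets[0], ast.Name)
    ]
    equations = []
    for node in sorted(assigns, key=lambda n: n.lineno):  # keep source order
        inputs = sorted(
            {n.id for n in ast.walk(node.value) if isinstance(n, ast.Name)}
        )
        # ast.unparse requires Python 3.9 or later.
        equations.append((node.targets[0].id, ast.unparse(node.value), inputs))
    return equations


def build_graph(equations, concepts):
    """Express equations as a computational graph keyed by output variable.

    `concepts` is a hypothetical mapping from variable names to ontology
    concepts, standing in for the semantic metadata named in the claim.
    """
    return {
        target: {
            "expression": expression,
            "inputs": inputs,
            "concept": concepts.get(target),
        }
        for target, expression, inputs in equations
    }


equations = extract_equations(SOURCE)
graph = build_graph(equations, {"a": "SpeedOfSound"})
```

Each graph node records which variables it depends on, so edges can be recovered from the "inputs" lists; a fuller treatment would also handle function calls, attribute access, and control flow, which this sketch ignores.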