CPC G16C 20/30 (2019.02) [G16C 20/70 (2019.02)] | 20 Claims
1. A method, in a data processing system comprising at least one processor and a memory comprising instructions which, when executed by the at least one processor, cause the at least one processor to implement a real-time prediction engine for real-time prediction of chemical properties through combining calculated, structured, and unstructured data at large scale, the method comprising:
inputting, for each of a plurality of chemical structures, into a chemical information processor, unstructured chemical features and properties extracted, by a natural language processing job server, from one or more unstructured chemical information sources and structured chemical features and properties extracted, by a structured data processor, from structured chemical information sources;
calculating, by the chemical information processor, for each of a plurality of chemical structures, calculated chemical structure features and properties based on the unstructured chemical features and properties, and the structured chemical features and properties;
storing, by offline components executing within the real-time prediction engine, a computational representation for each of the plurality of chemical structures in a unified storage, wherein each computational representation maps a respective chemical structure to a vector of corresponding calculated chemical structure features and properties, corresponding unstructured chemical features and properties, and corresponding structured chemical features and properties;
training, by the offline components using a machine learning training operation, a computational real-time predictive model based on the computational representations as inputs to the computational real-time predictive model, wherein the computational real-time predictive model is trained to predict properties based on an input chemical compound;
receiving, by a user interface executing within the real-time prediction engine, a request specifying one or more chemical compounds;
predicting, by an analytics jobs manager executing within the real-time prediction engine, one or more properties of the one or more chemical compounds using the computational real-time predictive model; and
outputting, by the analytics jobs manager, the one or more properties of the one or more chemical compounds to the user interface, wherein the computational real-time predictive model comprises a machine learning model, a deep learning model, or a neural network.
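
The steps recited above may be illustrated by a minimal sketch in Python. The sketch is not taken from the specification: the feature names, the `UnifiedStore` class, the toy data, and the choice of a linear model fitted by gradient descent are all assumptions made for illustration. The claim itself requires only that the model be a machine learning model, a deep learning model, or a neural network.

```python
# Hypothetical feature names, one per data source named in the claim.
FEATURE_ORDER = ["calculated_descriptor", "database_value", "text_mined_flag"]


def build_representation(calculated, structured, unstructured):
    """Map one structure to a fixed-order vector covering all three sources."""
    merged = {**calculated, **structured, **unstructured}
    return [float(merged[name]) for name in FEATURE_ORDER]


class UnifiedStore:
    """In-memory stand-in for the unified storage (an assumption)."""

    def __init__(self):
        self._vectors = {}

    def put(self, structure_id, vector):
        self._vectors[structure_id] = vector

    def get(self, structure_id):
        return self._vectors[structure_id]


def dot(weights, vector):
    return sum(w * x for w, x in zip(weights, vector))


def train(vectors, targets, lr=0.1, epochs=2000):
    """Offline step: fit a linear model by batch gradient descent."""
    n = len(vectors)
    weights = [0.0] * len(vectors[0])
    bias = 0.0
    for _ in range(epochs):
        errors = [dot(weights, x) + bias - y for x, y in zip(vectors, targets)]
        grad_w = [sum(e * x[j] for e, x in zip(errors, vectors)) / n
                  for j in range(len(weights))]
        grad_b = sum(errors) / n
        weights = [w - lr * g for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b
    return weights, bias


def predict(model, vector):
    """Online step: score one compound with the trained model."""
    weights, bias = model
    return dot(weights, vector) + bias


# Made-up toy data: (calculated, structured, unstructured, known property).
TOY_DATA = {
    "A": (0, 0, 0, 0.5),
    "B": (1, 0, 0, 2.5),
    "C": (0, 1, 0, 1.5),
    "D": (0, 0, 1, -0.5),
    "E": (1, 1, 1, 2.5),
}

# Offline components: store one representation per structure, then train.
store = UnifiedStore()
for sid, (calc, struct, text, _) in TOY_DATA.items():
    store.put(sid, build_representation(
        {"calculated_descriptor": calc},
        {"database_value": struct},
        {"text_mined_flag": text},
    ))
model = train([store.get(sid) for sid in TOY_DATA],
              [row[3] for row in TOY_DATA.values()])

# Online path: a request for one new compound is vectorized and scored.
request_vector = build_representation(
    {"calculated_descriptor": 1}, {"database_value": 1}, {"text_mined_flag": 0}
)
prediction = predict(model, request_vector)
print(round(prediction, 3))
```

In this sketch the offline path (storing and training) is kept separate from the online path (receiving a request and predicting), which mirrors the division between the offline components and the analytics jobs manager in the claim. A production system would presumably replace the in-memory dictionary and the linear model with a scalable store and a more expressive model.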