US 12,437,845 B2
Edge conditioned dynamic neighborhood aggregation based molecular property prediction
Sagar Srinivas Sakhinana, Pune (IN); Venkata Sudheendra Buddhiraju, Pune (IN); Sri Harsha Nistala, Pune (IN); and Venkataramana Runkana, Pune (IN)
Assigned to Tata Consultancy Services Limited, Mumbai (IN)
Filed by Tata Consultancy Services Limited, Mumbai (IN)
Filed on May 26, 2022, as Appl. No. 17/804,262.
Claims priority of application No. 202121046237 (IN), filed on Oct. 11, 2021.
Prior Publication US 2023/0116680 A1, Apr. 13, 2023
Int. Cl. G16C 10/00 (2019.01); G06F 17/16 (2006.01); G06N 3/048 (2023.01); G06N 3/063 (2023.01); G06N 3/08 (2023.01); G16C 20/30 (2019.01); G16C 20/70 (2019.01)
CPC G16C 10/00 (2019.02) [G06F 17/16 (2013.01); G06N 3/048 (2023.01); G06N 3/063 (2013.01); G06N 3/08 (2013.01); G16C 20/30 (2019.02); G16C 20/70 (2019.02)]
18 Claims
 
1. A processor-implemented method, comprising:
accessing, via one or more hardware processors, a database comprising a plurality of molecular graphs associated with a plurality of molecules and a plurality of labels indicative of chemical properties of the plurality of the molecular graphs, wherein each molecular graph of the plurality of molecular graphs comprises a plurality of sink nodes, each sink node of the plurality of sink nodes connected to a plurality of source nodes for passing neural messages through a plurality of connecting edges;
updating, via the one or more hardware processors, hidden states of the plurality of nodes of each molecular graph from amongst the plurality of molecular graphs by aggregating encoded neural messages from the plurality of sink nodes associated with each of the molecular graphs to transform a hidden representation of each sink node from amongst the plurality of sink nodes in a plurality of iterations, wherein transforming the hidden state of a sink node from amongst the plurality of sink nodes in a current iteration from amongst the plurality of iterations comprises:
determining a first key matrix representative of a plurality of edge-incorporated neural messages sent by the plurality of source nodes to the sink node in a set of previous iterations that occurred prior to the current iteration;
determining a first value matrix representative of the plurality of edge-incorporated neural messages sent by the plurality of source nodes to the sink node in the set of previous iterations;
determining a first query matrix representative of a linearly transformed hidden state of the sink node;
determining a first set of self-attention coefficients to give weightage to the plurality of edge-incorporated neural messages sent from the plurality of source nodes, the first set of self-attention coefficients determined as a softmax transform product of the first query matrix and the first key matrix;
calculating a single message vector to be perceived by the sink node based on a matrix multiplication of the first value matrix and the first set of self-attention coefficients, wherein the single message vector determines the hidden state of the sink node in a next iteration occurring subsequent to the current iteration;
determining a second key matrix representative of the hidden state of the sink node in the set of previous iterations;
determining a second value matrix representative of the hidden state of the sink node in the set of previous iterations;
determining a second query matrix as a product of the hidden state of the sink node determined at each of the plurality of previous iterations and a query projection matrix at the current iteration step;
determining a second set of self-attention coefficients to give weightage to the hidden state of the sink node determined at each of the plurality of previous iterations, the second set of self-attention coefficients determined as a softmax transform product of the second query matrix and the second key matrix;
calculating a self-attention based transformed hidden state of the sink node based on a product of the second set of self-attention coefficients with the second value matrix;
determining the hidden state of the sink node at the current iteration using the single message vector and the self-attention based transformed hidden state of the sink node; and
transforming the hidden state vector of the sink node to obtain a graph level embedding of the molecular graph; and
determining, via the one or more hardware processors, one or more molecular properties using a linear layer from the graph level embedding of the molecular graph.
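
For illustration only, the two attention steps recited in claim 1 can be sketched in plain Python. This is a minimal sketch under stated assumptions, not the claimed implementation: the function names (attend, update_sink, readout, predict), the separate key and value projection matrices, the use of a single query built from the latest hidden state, the tanh-of-sum rule for combining the single message vector with the self-attention based hidden state, and the sum-pooling readout are all hypothetical choices that the claim does not specify. The edge-incorporated neural messages are taken as given inputs.

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of scores.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def matvec(W, v):
    # Linear transform of a vector by a matrix given as a list of rows.
    return [dot(row, v) for row in W]

def attend(query, keys, values):
    # Attention coefficients = softmax(query . key); output = weighted sum of values.
    weights = softmax([dot(query, k) for k in keys])
    dim = len(values[0])
    out = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]
    return weights, out

def update_sink(h_sink, messages, history, W_q1, W_k1, W_v1, W_q2, W_k2, W_v2):
    # First attention: the sink node weighs edge-incorporated messages from its sources.
    q1 = matvec(W_q1, h_sink)
    k1 = [matvec(W_k1, m) for m in messages]
    v1 = [matvec(W_v1, m) for m in messages]
    a1, message = attend(q1, k1, v1)

    # Second attention: the sink node weighs its own hidden states from earlier iterations.
    # Assumption: one query from the latest hidden state, rather than one per iteration.
    q2 = matvec(W_q2, h_sink)
    k2 = [matvec(W_k2, h) for h in history]
    v2 = [matvec(W_v2, h) for h in history]
    a2, h_att = attend(q2, k2, v2)

    # Assumption: combine both results with an elementwise tanh of their sum.
    h_new = [math.tanh(m + h) for m, h in zip(message, h_att)]
    return h_new, a1, a2

def readout(node_states):
    # Assumption: sum pooling over node hidden states gives the graph level embedding.
    dim = len(node_states[0])
    return [sum(h[i] for h in node_states) for i in range(dim)]

def predict(embedding, weights, bias):
    # Linear layer mapping the graph level embedding to one property value.
    return dot(weights, embedding) + bias

# Toy example with 2-dimensional states and identity projections (made-up numbers).
I2 = [[1.0, 0.0], [0.0, 1.0]]
h_sink = [0.5, -0.2]
messages = [[1.0, 0.0], [0.0, 1.0], [0.3, 0.3]]
history = [[0.1, 0.1], [0.5, -0.2]]
h_new, a1, a2 = update_sink(h_sink, messages, history, I2, I2, I2, I2, I2, I2)
graph_embedding = readout([h_new, h_new])
prediction = predict(graph_embedding, [0.5, 0.5], 0.1)
```

In this sketch each set of attention coefficients sums to one, so the single message vector and the transformed hidden state are convex combinations of the value vectors. A trained model would learn the projection matrices and the final linear layer from the labeled molecular graphs.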