| CPC G06F 40/30 (2020.01) [G06F 40/126 (2020.01); G06F 40/205 (2020.01); G06F 40/279 (2020.01); G06F 40/40 (2020.01); G06N 3/045 (2023.01); G06N 3/08 (2013.01)] | 17 Claims |

|
1. A method for providing semantic encoding and language generation in a computing system by a processor, comprising:
automatically parsing unstructured data into one or more knowledge graphs based on the unstructured data and a list of candidate relations using a first machine learning model;
encoding, using the first machine learning model, the unstructured data into a distribution of a plurality of triples based on the one or more knowledge graphs, wherein the encoding further comprises predicted probabilities of relations between entities in the unstructured data;
sampling, using a second machine learning model, a set of the plurality of triples from the unstructured data of the one or more knowledge graphs;
generating text data from the set of the plurality of triples using the second machine learning model;
computing a penalty score for the set of the plurality of triples based on a degree of difference between the unstructured data and the generated text data; and
adjusting at least one predicted probability from the first machine learning model based on the determined penalty score.
|
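The steps recited in claim 1 can be illustrated with a toy sketch. This is not the patented implementation: both machine learning models are replaced by hypothetical stand-ins (a uniform relation distribution for the first model, a fixed sentence template for the second), the parsing and encoding steps are merged into one function, the penalty is assumed to be a Jaccard distance over tokens, and the adjustment rule and its `rate` parameter are illustrative assumptions. All function names are invented for this example.

```python
import random


def encode(text, candidate_relations):
    """Stand-in for the first model: one relation distribution per entity pair."""
    # Assumption: capitalized tokens are entities (a real parser would go here).
    entities = [tok.strip(".,") for tok in text.split() if tok[0].isupper()]
    pairs = [(h, t) for i, h in enumerate(entities) for t in entities[i + 1:]]
    uniform = 1.0 / len(candidate_relations)
    return {pair: {rel: uniform for rel in candidate_relations} for pair in pairs}


def sample_triples(distribution, rng):
    """Draw one (head, relation, tail) triple per entity pair from the distribution."""
    triples = []
    for (head, tail), probs in distribution.items():
        relations = list(probs)
        weights = [probs[rel] for rel in relations]
        relation = rng.choices(relations, weights=weights)[0]
        triples.append((head, relation, tail))
    return triples


def generate_text(triples):
    """Stand-in for the second model: render each triple with a fixed template."""
    return " ".join(f"{h} {r.replace('_', ' ')} {t}." for h, r, t in triples)


def penalty_score(source, generated):
    """Assumed penalty: Jaccard distance of token sets (0 = same, 1 = disjoint)."""
    a = set(source.lower().replace(".", "").split())
    b = set(generated.lower().replace(".", "").split())
    if not a and not b:
        return 0.0
    return 1.0 - len(a & b) / len(a | b)


def adjust(distribution, triples, penalty, rate=0.5):
    """Scale down the probability of each sampled relation, then renormalize."""
    for head, relation, tail in triples:
        probs = distribution[(head, tail)]
        probs[relation] *= 1.0 - rate * penalty
        total = sum(probs.values())
        for rel in probs:
            probs[rel] /= total


if __name__ == "__main__":
    source = "Paris is the capital of France."
    relations = ["is_the_capital_of", "is_located_near"]
    rng = random.Random(0)
    distribution = encode(source, relations)
    for _ in range(20):
        triples = sample_triples(distribution, rng)
        penalty = penalty_score(source, generate_text(triples))
        adjust(distribution, triples, penalty)
    print(distribution)
```

In this sketch a sampled relation whose generated text diverges from the source loses probability mass, while a relation that reproduces the source exactly incurs zero penalty and is left unchanged, so repeated rounds shift the distribution toward the better-matching triple.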