US 12,249,430 B1
Predicting reliability of structured data records generated using an extraction neural network
Zachary Michael Ziegler, Cambridge, MA (US); Jonas Sebastian Wulff, Glendale, CA (US); Evan Hernandez, Wimauma, FL (US); and Daniel Joseph Nadler, Nassau (BS)
Assigned to Xyla Inc., Wilmington, DE (US)
Filed by Xyla Inc., Wilmington, DE (US)
Filed on Aug. 20, 2024, as Appl. No. 18/810,328.
Application 18/810,328 is a continuation of application No. 18/219,027, filed on Jul. 6, 2023.
Claims priority of provisional application 63/368,434, filed on Jul. 14, 2022.
Int. Cl. G16H 50/70 (2018.01); G16H 10/20 (2018.01)
CPC G16H 50/70 (2018.01) [G16H 10/20 (2018.01)]
20 Claims
 
1. A method performed by one or more computers, the method comprising:
processing an input text sequence using an extraction neural network to generate an output text sequence that defines a corresponding structured data record, comprising, for each position in the output text sequence:
processing a sequence of embeddings representing the input text sequence and any part of the output text sequence preceding the position in the output text sequence in accordance with trained values of a set of extraction neural network parameters to generate a score distribution over a set of tokens; and
selecting a token, in accordance with the score distribution over the set of tokens, to occupy the position in the output text sequence;
wherein the extraction neural network has been trained by a machine learning training technique to perform a natural language understanding task;
wherein the structured data record represents information from the input text sequence in a format that is structured with reference to a predefined schema of semantic categories; and
wherein the structured data record comprises, for each semantic category in the schema, a text string that expresses information from the input text sequence that is relevant to the semantic category;
processing the structured data record to evaluate whether the structured data record satisfies each of one or more reliability criteria, comprising:
determining a confidence of the extraction neural network in generating the text string included in a semantic category of the structured data record; and
determining whether a reliability criterion is satisfied based on the confidence of the extraction neural network in generating the text string included in the semantic category; and
generating a reliability prediction characterizing a predicted reliability of information included in the structured data record based on a result of evaluating whether the structured data record satisfies the reliability criteria;
blocking the structured data record from being included in a database of structured data records based on the reliability prediction; and
generating and outputting a response to a user query based on the database of structured data records.
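
For illustration only, and not as part of the claim language, the steps recited in claim 1 can be sketched in Python. The sketch makes several assumptions that the claim does not state: token selection is greedy over a softmax of the scores, the confidence for a semantic category is the geometric mean of the probabilities of the tokens selected for that category's text string, and the reliability criterion is a fixed threshold on that confidence. The names `FieldExtraction`, `select_token`, `field_confidence`, `is_reliable`, and `admit_records`, the threshold value 0.8, and the example categories are all hypothetical.

```python
from __future__ import annotations

import math
from dataclasses import dataclass


@dataclass
class FieldExtraction:
    """Text string for one semantic category, plus per-token probabilities."""
    text: str
    token_probs: list[float]  # probability assigned to each selected token


def select_token(scores: list[float]) -> tuple[int, float]:
    """Greedy selection (an assumption): softmax the scores, pick the top token."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]  # subtract peak for stability
    total = sum(exps)
    probs = [e / total for e in exps]
    index = max(range(len(probs)), key=probs.__getitem__)
    return index, probs[index]


def field_confidence(field: FieldExtraction) -> float:
    """Assumed confidence measure: geometric mean of the token probabilities."""
    if not field.token_probs:
        return 0.0
    log_sum = sum(math.log(max(p, 1e-12)) for p in field.token_probs)
    return math.exp(log_sum / len(field.token_probs))


def is_reliable(record: dict[str, FieldExtraction], threshold: float = 0.8) -> bool:
    """Assumed criterion: every category's confidence must reach the threshold."""
    return all(field_confidence(f) >= threshold for f in record.values())


def admit_records(
    records: list[dict[str, FieldExtraction]], threshold: float = 0.8
) -> list[dict[str, FieldExtraction]]:
    """Keep reliable records; records predicted unreliable are blocked."""
    return [r for r in records if is_reliable(r, threshold)]


# Hypothetical records keyed by semantic category.
good = {
    "diagnosis": FieldExtraction("asthma", [0.95, 0.90]),
    "medication": FieldExtraction("albuterol", [0.99]),
}
bad = {
    "diagnosis": FieldExtraction("asthma", [0.95]),
    "medication": FieldExtraction("unknown", [0.30, 0.50]),
}
database = admit_records([good, bad])  # only `good` is admitted
```

A geometric mean is used here because it penalizes a single low-probability token more than an arithmetic mean would; other confidence measures and other reliability criteria would fit the claim language equally well.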