US 12,141,181 B2
Database query generation using natural language text
Jaya Prakash Narayana Gutta, New York, NY (US); Sharad Malhautra, New York, NY (US); and Lalit Gupta, Bangalore (IN)
Assigned to DSilo Inc., New York, NY (US)
Filed by DSilo Inc., New York, NY (US)
Filed on Sep. 25, 2023, as Appl. No. 18/473,939.
Application 18/473,939 is a continuation of application No. 18/073,815, filed on Dec. 2, 2022, granted, now 11,860,916.
Application 18/073,815 is a continuation of application No. 17/877,365, filed on Jul. 29, 2022, granted, now 11,520,815, issued on Dec. 6, 2022.
Claims priority of provisional application 63/227,790, filed on Jul. 30, 2021.
Claims priority of provisional application 63/227,793, filed on Jul. 30, 2021.
Claims priority of provisional application 63/227,796, filed on Jul. 30, 2021.
Prior Publication US 2024/0028629 A1, Jan. 25, 2024
This patent is subject to a terminal disclaimer.
Int. Cl. G06F 16/33 (2019.01); G06F 16/31 (2019.01); G06F 16/35 (2019.01); G06F 40/186 (2020.01); G06F 40/279 (2020.01); G06F 40/295 (2020.01); G06N 20/20 (2019.01); G06Q 50/18 (2012.01)

CPC G06F 16/3344 (2019.01) [G06F 16/31 (2019.01); G06F 16/3347 (2019.01); G06F 16/355 (2019.01); G06F 40/186 (2020.01); G06F 40/279 (2020.01); G06F 40/295 (2020.01); G06N 20/20 (2019.01); G06Q 50/18 (2013.01)]

16 Claims

1. A method of querying a computer database system, the method comprising:

determining, by a database system, a set of natural language questions associated with a given source of questions, the set of natural language questions comprising:

a current natural language question associated with the source; and

a set of prior natural language questions comprising natural language questions associated with the source and submitted prior to submission of the current natural language question;

determining, by a first question encoder of the database system, and based on the current natural language question, a first question model output comprising question vectors, the first question encoder employing a first bidirectional long-short-term memory (BiLSTM) model to generate a set of question vectors based on n-gram scores for n-grams in the current natural language question, an attention operation is conducted on the set of question vectors to generate an attention weighted set of question vectors, a concatenation operation is performed on the attention weighted set of question vectors to generate a concatenated set of question vectors, and the first question encoder employs a second BiLSTM model to generate, based on the concatenated set of question vectors, the question vectors;

determining, by a second question encoder of the database system and based on one or more questions of the set of prior natural language questions, a second question model output comprising context parameters;

determining, by a third question encoder of the database system and based on the question vectors and the context parameters, a set of concatenated question vectors for the question vectors and the context parameters;

determining, by the database system, a selected database corresponding to the current natural language question, the selected database comprising features having feature names and feature values;

determining, by a table encoder of the database system and based on the features of the selected database, a table encoder model output comprising word vectors;

determining, by a decoder of the database system and based on the set of concatenated question vectors and the word vectors, a set of strings comprising a feature name of the feature names of the data table and a database language operator;

determining, by a transformer model of the database system and based on the set of strings and the database language operator, a set of values;

generating, by the database system based on the set of strings and the set of values, a database language query for accessing data stored in a database;

executing, by the database system, the database language query to retrieve, from a datastore, a set of data corresponding to the database language query; and

presenting the set of data retrieved to a user in response to the execution of the database language query.
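
The final steps of claim 1 (generating a database language query from the set of strings and the set of values, executing it, and presenting the retrieved data) can be illustrated with a minimal sketch. This is not the patented implementation: it assumes SQL as the database language and SQLite as the datastore, and it skips the encoder, decoder, and transformer stages by taking their outputs as given. The names `build_query`, `run_query`, `demo`, `ALLOWED_OPERATORS`, and the `contracts` table with its rows are hypothetical and chosen only for illustration.

```python
import sqlite3

# Hypothetical allow-list of database language operators the decoder may emit.
ALLOWED_OPERATORS = {"=", "!=", "<", ">", "<=", ">="}


def build_query(table, feature_names, select_feature, conditions):
    """Assemble a parameterized SQL query.

    conditions is a list of (feature_name, operator, value) triples. The
    feature names and operators stand in for the claim's "set of strings";
    the values stand in for the "set of values". The table name is assumed
    to come from trusted code, not from user input.
    """
    if select_feature not in feature_names:
        raise ValueError(f"unknown feature name: {select_feature}")
    clauses, params = [], []
    for name, operator, value in conditions:
        if name not in feature_names:
            raise ValueError(f"unknown feature name: {name}")
        if operator not in ALLOWED_OPERATORS:
            raise ValueError(f"unsupported operator: {operator}")
        # Values are bound as parameters, never spliced into the SQL text.
        clauses.append(f'"{name}" {operator} ?')
        params.append(value)
    sql = f'SELECT "{select_feature}" FROM "{table}"'
    if clauses:
        sql += " WHERE " + " AND ".join(clauses)
    return sql, params


def run_query(conn, sql, params):
    """Execute the generated query and return the retrieved rows."""
    return conn.execute(sql, params).fetchall()


def demo():
    # In-memory datastore with made-up rows, for illustration only.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE contracts (party TEXT, amount REAL, year INTEGER)")
    conn.executemany(
        "INSERT INTO contracts VALUES (?, ?, ?)",
        [("Acme", 1200.0, 2021), ("Globex", 800.0, 2022), ("Initech", 1500.0, 2022)],
    )
    feature_names = ["party", "amount", "year"]
    # Assumed upstream outputs for a question such as
    # "Which parties signed contracts over 1000 in 2022?"
    conditions = [("year", "=", 2022), ("amount", ">", 1000)]
    sql, params = build_query("contracts", feature_names, "party", conditions)
    rows = run_query(conn, sql, params)
    conn.close()
    return sql, rows


if __name__ == "__main__":
    generated_sql, result_rows = demo()
    print(generated_sql)
    print(result_rows)  # presenting the retrieved data to the user
```

Checking feature names and operators against known sets, and binding values as parameters, is a design choice of this sketch rather than something the claim recites; it keeps model-generated strings from being executed as arbitrary SQL.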