US 11,734,510 B2
Natural language processing of encoded question tokens and encoded table schema based on similarity
Wangsu Hu, Chicago, IL (US); and Jilei Tian, Chicago, IL (US)
Assigned to Bayerische Motoren Werke Aktiengesellschaft, Munich (DE)
Filed by Bayerische Motoren Werke Aktiengesellschaft, Munich (DE)
Filed on Aug. 27, 2020, as Appl. No. 17/4,051.
Prior Publication US 2022/0067281 A1, Mar. 3, 2022
Int. Cl. G06F 40/226 (2020.01); G06F 40/284 (2020.01); G06F 40/30 (2020.01)
CPC G06F 40/284 (2020.01) [G06F 40/30 (2020.01)] 11 Claims
 
1. A computer-implemented method for optimizing execution of natural language to structured query language, comprising the steps of:
a. receiving a natural language text input;
b. performing natural language processing on the natural language text input;
c. generating a plurality of encoded question tokens;
d. performing natural language processing on a plurality of table schema stored in a database;
e. generating a plurality of encoded table schema tokens for each processed table schema of the plurality of table schema;
f. determining a similarity between the plurality of encoded question tokens and the plurality of encoded table schema tokens for at least two table schemas of the plurality of table schema;
g. determining an output table schema from the plurality of table schema based on the similarity,
wherein determining the output table schema comprises:
selecting a table schema with the highest similarity from the plurality of table schema, and
rejecting the output table schema if a similarity is below a threshold;
h. executing a validation query for the output table schema to determine whether the output table schema has valid information for outputting the natural language string, and, when the output table schema has no valid information:
1. rejecting the output table schema,
2. selecting a further output table schema,
3. executing the validation query for the further output table schema, and
4. repeating the steps 1.-3. when the further output table schema has no valid information; and
i. outputting a natural language string based on the output table schema.
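
The flow of steps a. through i. can be illustrated with a minimal sketch. It is not the patented implementation, and everything concrete in it is an assumption made for illustration: the bag-of-words `encode` function stands in for whatever encoder produces the encoded tokens, cosine similarity stands in for the unspecified similarity measure, SQLite stands in for the database, the validation query simply checks that a table contains at least one row, and the function names, the default threshold of 0.2, and the output strings are all hypothetical.

```python
import math
import re
import sqlite3
from collections import Counter


def encode(text):
    """Toy encoder: bag of lowercase word tokens (stand-in for a learned encoder)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def load_schemas(conn):
    """Steps d-e: read each table's name and column names from the SQLite catalog."""
    names = [row[0] for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")]
    return {n: [col[1] for col in conn.execute(f'PRAGMA table_info("{n}")')] for n in names}


def has_valid_information(conn, table):
    """Step h: hypothetical validation query; here 'valid' means the table has a row."""
    return conn.execute(f'SELECT 1 FROM "{table}" LIMIT 1').fetchone() is not None


def select_table(conn, question, threshold=0.2):
    """Steps c, f, g, h: return the best validated table name, or None."""
    q = encode(question)
    # Score every schema against the question, highest similarity first.
    ranked = sorted(
        ((cosine(q, encode(" ".join([name, *cols]))), name)
         for name, cols in load_schemas(conn).items()),
        reverse=True,
    )
    for score, name in ranked:
        if score < threshold:
            return None  # even the best remaining candidate is too dissimilar
        if has_valid_information(conn, name):
            return name
        # No valid information: reject this schema and try the next best one.
    return None


def answer(conn, question):
    """Step i: output a natural language string based on the selected schema."""
    table = select_table(conn, question)
    if table is None:
        return "Sorry, I could not find a table that answers that."
    return f"The '{table}' table looks like the best source for your question."
```

Calling `answer(conn, "which charging stations are in my city")` on a `sqlite3` connection ranks every table by similarity, then walks down the ranking until a table passes the validation query or the similarity drops below the threshold. Sorting once and iterating is one simple way to express the reject, select and re-validate loop of steps 1.-4.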