US 12,346,315 B1
Natural language query processing
Sheng Zhang, New Jersey, NJ (US); Patrick Ng, Great Neck, NY (US); Zhiguo Wang, Syosset, NY (US); Anuj Chauhan, New York, NY (US); Jiarong Jiang, Scarsdale, NY (US); Rishav Chakravarti, White Plains, NY (US); Stephen Michael Ash, Seattle, WA (US); Bing Xiang, Mount Kisco, NY (US); and Gregory David Adams, Seattle, WA (US)
Assigned to Amazon Technologies, Inc., Seattle, WA (US)
Filed by Amazon Technologies, Inc., Seattle, WA (US)
Filed on Mar. 21, 2023, as Appl. No. 18/187,569.
Int. Cl. G06F 16/24 (2019.01); G06F 16/242 (2019.01); G06F 16/2457 (2019.01); G06F 16/248 (2019.01)
CPC G06F 16/243 (2019.01) [G06F 16/24578 (2019.01); G06F 16/248 (2019.01)]
20 Claims
 
1. A computer-implemented method comprising:
receiving a natural language query (NLQ);
performing a first lexical query to retrieve metadata for the NLQ from an index of cell values and column names generated from topic metadata and structured datasets;
performing named entity recognition (NER) on the NLQ to determine two or more entities of the NLQ;
performing a second lexical query to retrieve a first set of linkable candidates, the set of linkable candidates identifying one or more columns or cells of one or more datasets likely to be associated with the determined two or more entities;
determining relations between the determined two or more entities of the NLQ, wherein determining relations comprises modifying the NLQ to include:
tokens to delineate at least a proper subset of the determined two or more entities of the NLQ,
entity type information for at least a proper subset of the determined two or more entities of the NLQ, wherein the entity type information includes an entity type value describing a semantic role that an entity is predicted to perform in the NLQ, and
an indication of a semantic relationship between at least two entities of the determined two or more entities;
performing named entity linking (NEL) using the set of linkable candidates and the modified NLQ to identify and rank candidate linkages between each of the determined two or more entities and the set of linkable candidates;
ranking datasets identified by the ranked candidate linkages for the determined two or more entities of the NLQ;
selecting one of multiple intent representations generated for the NLQ according to the ranked candidate linkages and the ranked datasets to be an intent representation for the NLQ according to a score for the intent representation;
causing the intent representation to be used to execute the NLQ; and
returning, via an interface, a result for the execution of the NLQ.
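
As a purely illustrative aid, the following minimal Python sketch shows one way the "modifying the NLQ" step and the score-based intent selection of claim 1 could look in practice. It is not taken from the patent: the marker syntax (`<e type=...>`, `<rel ...>`), the class names `Entity`, `Relation`, and `Intent`, the functions `annotate_nlq` and `select_intent`, and the example entity types and scores are all hypothetical assumptions chosen for readability. The lexical queries, NER, NEL, and dataset ranking steps are assumed to have already produced the inputs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    text: str         # surface form found in the NLQ
    entity_type: str  # hypothetical semantic-role label, e.g. "metric"


@dataclass(frozen=True)
class Relation:
    head: str   # surface form of the first entity
    tail: str   # surface form of the second entity
    label: str  # hypothetical relationship name, e.g. "grouped_by"


@dataclass(frozen=True)
class Intent:
    representation: str  # stand-in for an intent representation
    score: float


def annotate_nlq(nlq, entities, relations):
    """Wrap each entity in delimiter tokens carrying its type, then
    append an indication of each relation (all syntax is assumed)."""
    spans = []
    for ent in entities:
        start = nlq.find(ent.text)  # first occurrence only, for simplicity
        if start < 0:
            raise ValueError(f"entity not found in NLQ: {ent.text!r}")
        spans.append((start, start + len(ent.text), ent))
    spans.sort(key=lambda s: s[0])

    pieces, cursor = [], 0
    for start, end, ent in spans:
        if start < cursor:
            raise ValueError("overlapping entities are not handled here")
        pieces.append(nlq[cursor:start])
        pieces.append(f"<e type={ent.entity_type}>{nlq[start:end]}</e>")
        cursor = end
    pieces.append(nlq[cursor:])
    annotated = "".join(pieces)

    rel = " ".join(f"<rel {r.label}: {r.head} -> {r.tail}>" for r in relations)
    return f"{annotated} {rel}" if rel else annotated


def select_intent(candidates):
    """Pick the candidate intent representation with the highest score."""
    return max(candidates, key=lambda c: c.score)


modified = annotate_nlq(
    "total sales by region",
    [Entity("sales", "metric"), Entity("region", "dimension")],
    [Relation("sales", "region", "grouped_by")],
)
best = select_intent([
    Intent("SUM(sales) GROUP BY region", 0.91),
    Intent("COUNT(sales) GROUP BY region", 0.42),
])
print(modified)
print(best.representation)
```

In this sketch the modified NLQ carries the delimiter tokens, the entity type values, and the relationship indication inline, so that a downstream entity-linking component could consume all three from a single string. A real system might encode them quite differently.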