US 12,393,623 B2
Systems and methods for generating query parameters from natural language utterances
Hanoz Bhathena, New York, NY (US); Prateek Singh, Jersey City, NJ (US); and Aviral Joshi, Seattle, WA (US)
Assigned to JPMORGAN CHASE BANK, N.A., New York, NY (US)
Filed by JPMORGAN CHASE BANK, N.A., New York, NY (US)
Filed on Feb. 14, 2023, as Appl. No. 18/168,904.
Prior Publication US 2024/0273126 A1, Aug. 15, 2024
Int. Cl. G06F 16/00 (2019.01); G06F 16/3332 (2025.01); G06F 16/334 (2025.01)
CPC G06F 16/3347 (2019.01) [G06F 16/3334 (2019.01); G06F 16/3344 (2019.01); G06F 16/3346 (2019.01)]
8 Claims
 
1. A method comprising:
receiving, by a user device, an utterance as an audio file;
receiving, by an orchestration layer of a parameter generation platform, the utterance;
converting, by the orchestration layer, the audio file of the utterance into a text string;
tokenizing, by a tokenization engine of the parameter generation platform, the utterance into a plurality of tokens, each token of the plurality of tokens comprising a portion of the utterance separated by a space, wherein tokenizing includes identifying one or more tokens of the plurality of tokens to be operators and symbols within the utterance;
transforming, by a featurizer engine of the parameter generation platform, the plurality of tokens into a plurality of feature vectors, the featurizer engine comprising a sparse featurizer and a dense featurizer, the sparse featurizer providing a count of frequent individual words that are filtered based on individual words occurring in a plurality of received utterances, the dense featurizer providing a semantic meaning in context by converting one or more word strings of the utterance into a real valued feature vector, the plurality of feature vectors comprising the count of frequent individual words and the real valued feature vector;
determining, by a machine learning model of the parameter generation platform, an intent classification of the utterance based on the plurality of feature vectors;
classifying, by the machine learning model, the text string into query parameters;
determining, by the machine learning model, a database of a plurality of databases to query based on the plurality of feature vectors;
assigning, by the machine learning model, an entity label to each of the plurality of feature vectors;
resolving, by a disambiguation engine of the parameter generation platform, each feature vector of the plurality of feature vectors to a corresponding standardized value of a database query language, wherein resolving the each feature vector of the plurality of feature vectors to a corresponding standardized value of a database query language includes processing the each feature vector of the plurality of feature vectors with a string-based algorithm computing a Levenshtein distance;
scoring, by the disambiguation engine, the each feature vector of the plurality of feature vectors with respect to a plurality of candidates from a reference data table;
mapping, by the disambiguation engine, the each feature vector of the plurality of feature vectors to a key value, wherein the key value corresponds to a candidate of the plurality of candidates with a highest score, the highest score calculated based on the Levenshtein distance;
determining, by the disambiguation engine, a conflict between the corresponding standardized value and the entity label; and
overriding, by the disambiguation engine, the entity label based on the determination.
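
For illustration only, the tokenizing step of claim 1 (space-separated tokens, with some tokens identified as operators or symbols) might be sketched as follows. The operator vocabulary, the function name tokenize, and the "kind" field are hypothetical; the claim does not enumerate operators or prescribe a data layout.

```python
import string

# Hypothetical operator vocabulary; the claim does not list specific operators.
OPERATOR_WORDS = {"and", "or", "not", "above", "below", "between"}
SYMBOLS = set(string.punctuation)


def tokenize(utterance: str) -> list[dict]:
    """Split an utterance on spaces and tag each token as operator, symbol, or word."""
    tokens = []
    for piece in utterance.split(" "):
        if not piece:
            continue  # skip empty strings produced by repeated spaces
        if piece.lower() in OPERATOR_WORDS:
            kind = "operator"
        elif all(ch in SYMBOLS for ch in piece):
            kind = "symbol"
        else:
            kind = "word"
        tokens.append({"text": piece, "kind": kind})
    return tokens
```

In this sketch, an utterance such as "price > 100 and rating" yields five tokens, with ">" tagged as a symbol and "and" tagged as an operator.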
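
As a rough sketch of the featurizer engine recited in claim 1, the code below concatenates a sparse count vector with a dense real-valued vector. The function names, the min_count threshold, and the vector dimension are assumptions. The dense featurizer shown here is only a placeholder built from hashed character trigrams; it does not capture semantic meaning, and a trained embedding model would presumably take its place in practice.

```python
from collections import Counter
import hashlib


def build_vocabulary(utterances: list[str], min_count: int = 2) -> list[str]:
    """Keep words that occur at least min_count times across received utterances."""
    counts = Counter(w for u in utterances for w in u.lower().split())
    return sorted(w for w, c in counts.items() if c >= min_count)


def sparse_features(utterance: str, vocabulary: list[str]) -> list[int]:
    """Count how often each frequent vocabulary word appears in the utterance."""
    counts = Counter(utterance.lower().split())
    return [counts[w] for w in vocabulary]


def dense_features(utterance: str, dim: int = 8) -> list[float]:
    """Placeholder dense vector: hashed character trigrams, L2-normalized.

    Assumption: stands in for a learned embedding; it is NOT semantic.
    """
    vec = [0.0] * dim
    text = utterance.lower()
    for i in range(len(text) - 2):
        digest = hashlib.md5(text[i:i + 3].encode("utf-8")).digest()
        vec[digest[0] % dim] += 1.0
    norm = sum(v * v for v in vec) ** 0.5
    return [v / norm for v in vec] if norm else vec


def featurize(utterance: str, vocabulary: list[str], dim: int = 8) -> list[float]:
    """Concatenate the sparse counts and the dense vector into one feature vector."""
    return sparse_features(utterance, vocabulary) + dense_features(utterance, dim)
```

Filtering the vocabulary by frequency across prior utterances mirrors the claim language about counting frequent individual words; the threshold of two occurrences is arbitrary.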
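
The resolving, scoring, mapping, and overriding steps of claim 1 could be approximated as shown below. This is a minimal sketch under several assumptions: the Levenshtein distance is applied to the surface text of an entity mention rather than to a feature vector, the score is a length-normalized similarity, and a conflict is resolved by adopting the label stored in the reference data table. The table contents and all names (REFERENCE_TABLE, score, resolve) are hypothetical.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance (insert, delete, substitute) via row-by-row dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def score(mention: str, candidate: str) -> float:
    """Assumed scoring rule: 1.0 for identical strings, lower as distance grows."""
    longest = max(len(mention), len(candidate))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(mention.lower(), candidate.lower()) / longest


# Hypothetical reference data table rows: (key value, candidate string, entity label).
REFERENCE_TABLE = [
    ("AAPL", "apple", "company"),
    ("MSFT", "microsoft", "company"),
    ("USD", "us dollar", "currency"),
]


def resolve(mention: str, predicted_label: str) -> dict:
    """Map a mention to the best-scoring key and override a conflicting label."""
    scored = [(score(mention, cand), key, label) for key, cand, label in REFERENCE_TABLE]
    best_score, key, table_label = max(scored, key=lambda row: row[0])
    conflict = table_label != predicted_label
    return {
        "key": key,
        "score": best_score,
        # Assumption: on conflict, the reference table's label replaces the model's label.
        "label": table_label if conflict else predicted_label,
        "overridden": conflict,
    }
```

For example, resolve("aple", "currency") scores the misspelled mention against every candidate, maps it to the key for "apple", and replaces the model's "currency" label with the table's "company" label.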