US 12,332,906 B1
Automatic data analysis formula phrase generation
Pankaj Kulkarni, Bangalore (IN); Anurag Tomer, Ghaziabad (IN); Kedar Milind Kulkarni, Bangalore (IN); Alok Yadav, New Delhi (IN); and Akshay Mehra, Bengaluru (IN)
Assigned to ThoughtSpot, Inc., Mountain View, CA (US)
Filed by ThoughtSpot, Inc., Mountain View, CA (US)
Filed on Apr. 10, 2024, as Appl. No. 18/631,342.
Int. Cl. G06F 16/248 (2019.01); G06F 16/2452 (2019.01); G06F 16/2455 (2019.01)
CPC G06F 16/248 (2019.01) [G06F 16/24522 (2019.01); G06F 16/2455 (2019.01)]
19 Claims
1. A method comprising:
obtaining, by a data access and analysis system, first user input data including a natural language string, wherein the natural language string expresses a data-analysis formula phrase in a form that is inconsistent with a defined data-analysis-formula grammar implemented by the data access and analysis system, wherein the data access and analysis system implements defined data-analysis formula phrases in accordance with the defined data-analysis-formula grammar, wherein the defined data-analysis formula phrases are, respectively, associated with defined data-analysis formula phrase categories;
obtaining, by the data access and analysis system, first large language model input data including the natural language string and a first proper subset of the defined data-analysis formula phrases that is diverse with respect to the defined data-analysis formula phrase categories;
obtaining, by the data access and analysis system, first large language model generated data output by a large language model in response to the first large language model input data;
identifying, by the data access and analysis system, in accordance with the first large language model generated data, a proper subset of the defined data-analysis formula phrase categories;
obtaining, by the data access and analysis system, second large language model input data including the natural language string and a second proper subset of the defined data-analysis formula phrases obtained in accordance with the proper subset of the defined data-analysis formula phrase categories;
obtaining, by the data access and analysis system, second large language model generated data output by the large language model in response to the second large language model input data, wherein the second large language model generated data includes an automatically generated data-analysis formula phrase generated by the large language model to express the natural language string;
obtaining, by the data access and analysis system, a data-analysis formula phrase object as an internal representation of the automatically generated data-analysis formula phrase;
obtaining, by the data access and analysis system, second user input data that expresses a request for data analysis with respect to data stored in a data source of the data access and analysis system, wherein the request for data includes a data-analysis formula phrase name of the automatically generated data-analysis formula phrase;
obtaining, by the data access and analysis system, responsive to the request for data, resolved request data including data referring to the data-analysis formula phrase object;
obtaining, by the data access and analysis system, a data query in accordance with the resolved request data, the data-analysis formula phrase object, and a defined structured query language implemented by the data source;
obtaining, by the data access and analysis system, results data responsive to the request for data generated by execution of the data query by the data source; and
outputting results presentation data for presenting one or more portions of the results data.
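The two large language model passes recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation. The catalog contents, the formula syntax, the prompt wording, and every name below (CATALOG, diverse_subset, phrases_for, build_prompt, generate_formula, fake_llm) are hypothetical. The language model is passed in as a plain callable so that no particular vendor API is assumed, and the later steps of the claim (formula phrase object, request resolution, query generation, results output) are not covered.

```python
from typing import Callable, Dict, List

# Hypothetical catalog of defined formula phrases, keyed by category.
# The syntax is invented for illustration only.
CATALOG: Dict[str, List[str]] = {
    "aggregate": ["sum(revenue)", "average(price)", "count(orders)"],
    "date": ["month(order_date)", "diff_days(ship_date, order_date)"],
    "text": ["concat(first_name, last_name)", "strlen(city)"],
}


def diverse_subset(catalog: Dict[str, List[str]], per_category: int = 1) -> Dict[str, List[str]]:
    """First subset: a few example phrases drawn from every category."""
    return {cat: phrases[:per_category] for cat, phrases in catalog.items()}


def phrases_for(catalog: Dict[str, List[str]], categories: List[str]) -> Dict[str, List[str]]:
    """Second subset: all phrases, but only from the selected categories."""
    return {cat: catalog[cat] for cat in categories if cat in catalog}


def build_prompt(task: str, request: str, examples: Dict[str, List[str]]) -> str:
    """Assemble a prompt from a task line, the user's request, and examples."""
    lines = [task, f"Request: {request}", "Examples:"]
    for cat, phrases in examples.items():
        lines.extend(f"[{cat}] {phrase}" for phrase in phrases)
    return "\n".join(lines)


def generate_formula(request: str, catalog: Dict[str, List[str]],
                     llm: Callable[[str], str]) -> str:
    # Pass 1: show a category-diverse sample and ask which categories apply.
    first = build_prompt(
        "List the formula categories relevant to the request, comma-separated.",
        request, diverse_subset(catalog))
    reply = llm(first)
    categories = [c.strip() for c in reply.split(",") if c.strip() in catalog]
    if not categories:
        raise ValueError("model did not name any known category")

    # Pass 2: show only phrases from those categories and ask for a formula.
    second = build_prompt(
        "Write one formula for the request in the grammar of the examples.",
        request, phrases_for(catalog, categories))
    return llm(second).strip()


def fake_llm(prompt: str) -> str:
    """Stand-in for a real model so the sketch runs offline."""
    if prompt.startswith("List"):
        return "aggregate"
    return "sum(revenue) / count(orders)"


result = generate_formula("average revenue per order", CATALOG, fake_llm)
print(result)
```

In this sketch the first prompt stays small because it carries only one example per category, and the second prompt spends its context on the categories the model selected. A real system would additionally check that the selected categories form a proper subset and would validate the returned formula against the grammar before storing it.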