| CPC G06F 16/24522 (2019.01) [G06F 16/215 (2019.01); G06F 16/242 (2019.01); G06F 16/24539 (2019.01)] | 20 Claims |

|
1. A computer-implemented method for querying heterogeneous data sources using a machine learning based language model, the computer-implemented method comprising:
storing metadata describing a plurality of data assets, wherein each of the plurality of data assets is stored in a data source of a plurality of data sources;
receiving, from a client device, a natural language question;
generating a prompt requesting a database query using syntax of a database query language, the database query corresponding to the natural language question;
sending the prompt to a machine learning based language model;
receiving a database query using syntax of a database query language generated by the machine learning based language model, the database query including one or more generated data asset names;
for each of the one or more generated data asset names, determining a data asset corresponding to the generated data asset name based on metadata describing the data asset;
modifying the database query by replacing each of the one or more generated data asset names by a name of the data asset corresponding to the generated data asset name; and
sending the modified database query for execution.
|
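The steps of claim 1 can be illustrated with a minimal Python sketch. The claim does not specify how a generated data asset name is matched against metadata, so the fuzzy string matching used here (the standard library's `difflib.get_close_matches`) is an assumption made for illustration, not the claimed technique. The catalog contents, the alias field, and the function names `build_prompt`, `resolve_asset`, and `rewrite_query` are all hypothetical. The call to the language model is omitted, and its output is represented by a plain query string.

```python
import difflib
import re

# Hypothetical metadata store: fully qualified asset name -> metadata.
# The "aliases" field is an assumed piece of metadata used for matching.
CATALOG = {
    "sales_db.public.customer_orders": {
        "source": "postgres",
        "aliases": ["orders", "customer_orders"],
    },
    "warehouse.hr.employee_master": {
        "source": "snowflake",
        "aliases": ["employees", "employee"],
    },
}

# Naive pattern: treat the identifier after FROM or JOIN as an asset name.
# A real system would more likely use a SQL parser.
ASSET_PATTERN = re.compile(r"\b(FROM|JOIN)\s+([A-Za-z_][\w.]*)", re.IGNORECASE)


def build_prompt(question: str) -> str:
    """Build a prompt asking the model for a SQL query answering the question."""
    return (
        "Write a SQL query that answers the following question.\n"
        f"Question: {question}\n"
        "Return only the SQL."
    )


def resolve_asset(generated_name: str, catalog: dict, cutoff: float = 0.6):
    """Map a model-generated name to a catalog asset, or None if nothing is close."""
    candidates = {}
    for actual, meta in catalog.items():
        candidates[actual.lower()] = actual                  # full name
        candidates[actual.split(".")[-1].lower()] = actual   # short name
        for alias in meta["aliases"]:
            candidates[alias.lower()] = actual
    match = difflib.get_close_matches(
        generated_name.lower(), list(candidates), n=1, cutoff=cutoff
    )
    return candidates[match[0]] if match else None


def rewrite_query(query: str, catalog: dict) -> str:
    """Replace each generated asset name with the name of its catalog asset."""

    def _substitute(m: re.Match) -> str:
        keyword, name = m.group(1), m.group(2)
        actual = resolve_asset(name, catalog)
        # Leave the name unchanged when no catalog asset is close enough.
        return f"{keyword} {actual or name}"

    return ASSET_PATTERN.sub(_substitute, query)


if __name__ == "__main__":
    prompt = build_prompt("How many orders did each employee handle?")
    # Stand-in for the model's response; the table names are guesses.
    generated = "SELECT * FROM order o JOIN employees e ON o.emp_id = e.id"
    print(rewrite_query(generated, CATALOG))
```

In this sketch a generated name with no sufficiently close catalog entry is left unchanged, which is one possible design choice; a system could equally reject the query or ask the model to regenerate it.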