CPC G06F 16/242 (2019.01) [G06F 16/245 (2019.01); G06F 40/205 (2020.01); G06F 40/30 (2020.01)] | 14 Claims |
1. A method for generating a data pair, comprising:
generating M Structured Query Language (SQL) query statements for a given database, where M is a positive integer greater than one;
performing the following processing for each SQL query statement: dividing the SQL query statement into at least one SQL clause; obtaining a question description corresponding to each SQL clause; combining the question descriptions to obtain a question corresponding to the SQL query statement,
wherein the obtaining a question description corresponding to each SQL clause comprises:
for any SQL clause, generating the question description corresponding to the SQL clause by using a pre-trained generation model,
wherein training to obtain the generation model comprises:
constructing an SQL clause-question description pair according to an existing question-SQL query statement pair, and training according to the SQL clause-question description pair to obtain the generation model,
wherein the constructing an SQL clause-question description pair according to an existing question-SQL query statement pair comprises:
performing the following processing for any question-SQL query statement pair:
dividing the SQL query statement in the question-SQL query statement pair into at least one SQL clause;
obtaining the question description corresponding to each SQL clause; wherein the question description corresponding to any SQL clause includes: question fragments of the question in the question-SQL query statement pair covering all units in the SQL clause.
|
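The per-statement processing recited in claim 1 (divide the SQL query statement into clauses, obtain a question description for each clause, combine the descriptions into a question) might be sketched as follows. This is an illustrative sketch only, not the claimed implementation. The keyword list, the function names, and the template lookup are all assumptions. In particular, `describe_clause` is a hypothetical stand-in for the pre-trained generation model, and the keyword-based splitter handles only flat statements with no subqueries and no keywords inside string literals.

```python
import re

# Assumed clause boundaries; the claim does not fix how a statement is divided.
CLAUSE_KEYWORDS = ["SELECT", "FROM", "WHERE", "GROUP BY", "HAVING", "ORDER BY", "LIMIT"]

# Hypothetical templates standing in for the pre-trained generation model.
TEMPLATES = {
    "SELECT": "show {}",
    "FROM": "from the {} table",
    "WHERE": "where {}",
    "GROUP BY": "grouped by {}",
    "HAVING": "having {}",
    "ORDER BY": "ordered by {}",
    "LIMIT": "limited to {} rows",
}


def split_into_clauses(sql: str) -> list[str]:
    """Divide a flat SQL query statement into clauses at the assumed keywords."""
    pattern = r"\b(" + "|".join(k.replace(" ", r"\s+") for k in CLAUSE_KEYWORDS) + r")\b"
    matches = list(re.finditer(pattern, sql, flags=re.IGNORECASE))
    clauses = []
    for i, match in enumerate(matches):
        # A clause runs from its keyword to the start of the next keyword.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(sql)
        clauses.append(sql[match.start():end].strip().rstrip(";").strip())
    return clauses


def describe_clause(clause: str) -> str:
    """Return a question description for one clause (template lookup, not a model)."""
    normalized = " ".join(clause.split())  # collapse runs of whitespace
    for keyword, template in TEMPLATES.items():
        if normalized.upper().startswith(keyword):
            return template.format(normalized[len(keyword):].strip())
    return normalized  # unknown clause type: fall back to the raw text


def make_pair(sql: str) -> tuple[str, str]:
    """Build one (question, SQL query statement) data pair."""
    descriptions = [describe_clause(c) for c in split_into_clauses(sql)]
    return " ".join(descriptions), sql


if __name__ == "__main__":
    question, statement = make_pair("SELECT name FROM singer WHERE age > 30 ORDER BY age")
    print(question)
    print(statement)
```

In a real system, `describe_clause` would call the generation model trained on SQL clause-question description pairs, and the splitter would more plausibly operate on a parsed syntax tree than on a regular expression.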