US 11,748,340 B2
Data pair generating method, apparatus, electronic device and storage medium
Lijie Wang, Beijing (CN); and Ao Zhang, Beijing (CN)
Assigned to BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY CO., LTD., Beijing (CN)
Filed by BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY CO., LTD., Beijing (CN)
Filed on Jul. 23, 2021, as Appl. No. 17/383,642.
Claims priority of application No. 202011410065.5 (CN), filed on Dec. 3, 2020.
Prior Publication US 2022/0179847 A1, Jun. 9, 2022
Int. Cl. G06F 16/00 (2019.01); G06F 16/242 (2019.01); G06F 16/245 (2019.01); G06F 40/205 (2020.01); G06F 40/30 (2020.01)
CPC G06F 16/242 (2019.01) [G06F 16/245 (2019.01); G06F 40/205 (2020.01); G06F 40/30 (2020.01)]
14 Claims
1. A method for generating data pair, comprising:
generating M Structured Query Language (SQL) query statements for a given database, where M is a positive integer greater than one;
performing the following processing for each SQL query statement:
dividing the SQL query statement into at least one SQL clause;
obtaining a question description corresponding to each SQL clause;
combining the question descriptions to obtain a question corresponding to the SQL query statement,
wherein the obtaining a question description corresponding to each SQL clause comprises:
for any SQL clause, generating the question description corresponding to the SQL clause by using a pre-trained generation model,
wherein training to obtain the generation model comprises:
constructing an SQL clause-question description pair according to an existing question-SQL query statement pair, and training according to the SQL clause-question description pair to obtain the generation model,
wherein the constructing an SQL clause-question description pair according to an existing question-SQL query statement pair comprises:
performing the following processing for any question-SQL query statement pair:
dividing the SQL query statement in the question-SQL query statement pair into at least one SQL clause;
obtaining the question description corresponding to each SQL clause;
wherein the question description corresponding to any SQL clause includes: problem fragments of the question in the question-SQL query statement pair covering all units in the SQL clause.
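
The per-statement processing recited in claim 1 (divide into clauses, describe each clause, combine the descriptions) might be sketched as follows. This is an illustrative sketch only, not the patented implementation: the keyword-based splitter, the function names, and the `describe_clause` placeholder (which stands in for the pre-trained generation model) are all assumptions introduced here, and the splitter handles only flat queries without subqueries or keywords inside string literals.

```python
import re

# Hypothetical clause keywords; a production splitter would use a real SQL parser.
CLAUSE_KEYWORDS = ["SELECT", "FROM", "WHERE", "GROUP BY", "HAVING", "ORDER BY", "LIMIT"]

_CLAUSE_RE = re.compile(
    r"\b(" + "|".join(k.replace(" ", r"\s+") for k in CLAUSE_KEYWORDS) + r")\b",
    re.IGNORECASE,
)


def split_into_clauses(sql):
    """Divide a flat SQL query statement into at least one SQL clause."""
    sql = sql.strip().rstrip(";")
    starts = [m.start() for m in _CLAUSE_RE.finditer(sql)]
    if not starts:
        # No recognized keyword: treat the whole statement as one clause.
        return [sql]
    bounds = starts + [len(sql)]
    return [sql[a:b].strip() for a, b in zip(bounds, bounds[1:])]


def describe_clause(clause):
    """Placeholder for the pre-trained generation model (assumption)."""
    return f"[description of: {clause}]"


def build_question(sql, describe=describe_clause):
    """Obtain a description per clause and combine them into one question."""
    descriptions = [describe(clause) for clause in split_into_clauses(sql)]
    return " ".join(descriptions)


def generate_pairs(sql_statements, describe=describe_clause):
    """Return (question, SQL query statement) data pairs for M statements."""
    return [(build_question(sql, describe), sql) for sql in sql_statements]


if __name__ == "__main__":
    statements = [
        "SELECT name FROM singer WHERE age > 30 ORDER BY age;",
        "SELECT count(*) FROM concert GROUP BY year",
    ]
    for question, sql in generate_pairs(statements):
        print(sql, "->", question)
```

Passing a different `describe` callable is where a trained clause-to-description model would plug in; the simple space-joined combination is likewise a stand-in, since the claim does not specify how the descriptions are combined.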