| CPC G06F 16/24544 (2019.01) [G06F 11/3409 (2013.01); G06F 16/24561 (2019.01); G06F 16/256 (2019.01)] | 20 Claims |

|
1. A computer-implemented method comprising:
receiving at a data virtualization layer (DV), by one or more processors, a join query request related to a plurality of tables {T1, T2, . . . , Tn} respectively stored in a plurality of distributed database servers {S1, S2, . . . , Sn}, wherein n is an integer larger than 1;
implementing a catalog agent for the data virtualization layer;
mapping tables, using the catalog agent, in the next database server into virtual tables;
generating, by the one or more processors, a plurality of candidate query plans for the join query request, each of the plurality of candidate query plans indicating an order for transmitting the tables {T1, T2, . . . , Tn} respectively stored in the database servers {S1, S2, . . . , Sn} to the DV, wherein for one candidate query plan P={Si→Sj→ . . . Sn→, . . . , →DV}, 1≤i, j≤n, table Ti stored in database server Si is transmitted to and stored in database server Sj before being transmitted together with table Tj which is stored in database server Sj to a next database server, and wherein in at least one of the database servers {S1, S2, . . . , Sn}, a join work is performed on the stored tables based on the join query request before being transmitted to the next database server, the join work being performed by the catalog agent;
for each of the plurality of candidate query plans, calculating, by the one or more processors, a query cost for the candidate query plan based on a data amount of the tables to be transmitted according to the candidate query plan; and
determining, by the one or more processors, from the plurality of candidate query plans, a query plan for the join query request which has a lowest query cost.
|
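The plan-generation, cost-calculation, and plan-selection steps of claim 1 can be illustrated with a minimal Python sketch. It is not the claimed implementation. The names `plan_cost` and `best_plan`, the row-count measure of "data amount", and the single `selectivity` factor used to estimate join output size are all hypothetical assumptions introduced here. The sketch also assumes that a join is performed at every server after the first, whereas the claim only requires a join in at least one server.

```python
from itertools import permutations


def plan_cost(order, sizes, selectivity):
    """Total data amount shipped along one plan: first server -> ... -> DV.

    order       -- tuple of server names in transmission order
    sizes       -- dict mapping server name to its table's row count (assumed metric)
    selectivity -- hypothetical fraction of the cross product kept by each join
    """
    carried = sizes[order[0]]      # rows leaving the first server
    cost = carried                 # first hop to the second server
    for server in order[1:]:
        # Assumption: join the arriving rows with the local table before forwarding.
        carried = carried * sizes[server] * selectivity
        cost += carried            # hop to the next server, or the final hop to the DV
    return cost


def best_plan(sizes, selectivity):
    """Enumerate every server ordering and return the cheapest (plan, cost) pair."""
    candidates = permutations(sizes)   # one candidate plan per ordering of servers
    return min(
        ((plan, plan_cost(plan, sizes, selectivity)) for plan in candidates),
        key=lambda pair: pair[1],
    )


if __name__ == "__main__":
    # Illustrative row counts only; not taken from the patent.
    table_rows = {"S1": 1000, "S2": 10, "S3": 100}
    plan, cost = best_plan(table_rows, selectivity=0.01)
    print(" -> ".join(plan) + " -> DV", cost)
```

Exhaustive enumeration is used only because it mirrors the claim wording most directly; it grows factorially with n, so a practical planner would likely prune or sample the candidate set.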