US 11,960,484 B2
Identifying joins of tables of a database
Kireet Agrawal, Tampa, FL (US); Juliette May Hu, Valdosta, GA (US); and Aditya Singh Chand, San Jose, CA (US)
Assigned to ThoughtSpot, Inc., Mountain View, CA (US)
Filed by ThoughtSpot, Inc., San Jose, CA (US)
Filed on Oct. 13, 2021, as Appl. No. 17/500,508.
Prior Publication US 2023/0112250 A1, Apr. 13, 2023
Int. Cl. G06F 16/24 (2019.01); G06F 16/22 (2019.01); G06F 16/2453 (2019.01)
CPC G06F 16/24544 (2019.01) [G06F 16/221 (2019.01); G06F 16/24537 (2019.01)]
19 Claims
 
1. A method for identifying table joins, comprising:
obtaining respective casting similarities between pairs of columns of a first table and a second table, wherein a pair of columns of the pairs of columns comprises a first column of the first table and a second column of the second table, and wherein a casting similarity for the pair of columns is obtained by steps comprising:
assigning the casting similarity to the pair of columns based on an identified extent to which first data values of the first column are changeable to a data type of the second column, wherein the casting similarity is selected from a set comprising a ‘very low’ casting similarity, and wherein the ‘very low’ casting similarity is assigned to a given pair of columns in a case that one column of the given pair of columns has a BOOLEAN type and the other column of the given pair of columns has a FLOAT type;
discarding, to obtain first join candidates, ones of the pairs of columns having the respective casting similarities not satisfying a casting similarity threshold;
obtaining respective string similarities for the first join candidates;
discarding ones of the first join candidates not satisfying a string similarity condition to obtain second join candidates;
obtaining final join candidates using the respective casting similarities and the respective string similarities of the second join candidates, wherein each of the final join candidates includes a column of the first table and a column of the second table;
presenting the final join candidates on a device of a user;
receiving, from the device of the user, a selected join candidate of the final join candidates;
querying a database based on a data query that includes a join of the first table and the second table to obtain tabular data, wherein the join is based on the selected join candidate; and
outputting the tabular data.
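
The steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation, and most of its details are assumptions made for illustration: the `Cast` levels other than 'very low', the contents of `_CAST_TABLE` (only the BOOLEAN/FLOAT entry comes from the claim), the use of declared column types rather than inspected data values, the use of `difflib.SequenceMatcher` on column names as the string similarity, both threshold values, and the hypothetical `orders` and `customers` tables. The user's selection is simulated by taking the top-ranked candidate.

```python
import sqlite3
from difflib import SequenceMatcher
from enum import IntEnum
from itertools import product


class Cast(IntEnum):
    """Ordered casting-similarity levels (only VERY_LOW is named in the claim)."""
    VERY_LOW = 0
    LOW = 1
    MEDIUM = 2
    HIGH = 3


# Hypothetical lookup keyed by an unordered pair of types.
# Only the BOOLEAN/FLOAT -> VERY_LOW entry is taken from the claim text.
_CAST_TABLE = {
    frozenset({"BOOLEAN", "FLOAT"}): Cast.VERY_LOW,
    frozenset({"INT", "FLOAT"}): Cast.MEDIUM,
    frozenset({"INT", "VARCHAR"}): Cast.LOW,
}


def casting_similarity(type_a, type_b):
    """Simplification: score by declared type, not by inspecting data values."""
    if type_a == type_b:
        return Cast.HIGH
    return _CAST_TABLE.get(frozenset({type_a, type_b}), Cast.VERY_LOW)


def find_join_candidates(cols_a, cols_b, cast_threshold=Cast.MEDIUM,
                         min_name_ratio=0.6):
    """cols_a and cols_b map column name -> type name for each table."""
    # Keep pairs whose casting similarity meets the threshold.
    first = []
    for (name_a, type_a), (name_b, type_b) in product(cols_a.items(),
                                                      cols_b.items()):
        cast = casting_similarity(type_a, type_b)
        if cast >= cast_threshold:
            first.append((name_a, name_b, cast))

    # Keep pairs whose column names are similar enough (assumed condition).
    second = []
    for name_a, name_b, cast in first:
        ratio = SequenceMatcher(None, name_a.lower(), name_b.lower()).ratio()
        if ratio >= min_name_ratio:
            second.append((name_a, name_b, cast, ratio))

    # Rank the survivors using both scores, best first.
    return sorted(second, key=lambda c: (c[2], c[3]), reverse=True)


def run_join(conn, table_a, table_b, candidate):
    """Query the database with a join built from the selected candidate."""
    col_a, col_b = candidate[0], candidate[1]
    # Identifiers are double-quoted; real code should also validate them.
    sql = (f'SELECT * FROM "{table_a}" AS a JOIN "{table_b}" AS b '
           f'ON a."{col_a}" = b."{col_b}"')
    return conn.execute(sql).fetchall()


# Hypothetical schemas and data, for illustration only.
orders = {"customer_id": "INT", "amount": "FLOAT", "is_paid": "BOOLEAN"}
customers = {"id": "INT", "customer_id": "INT", "balance": "FLOAT"}

candidates = find_join_candidates(orders, customers)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer_id INT, amount FLOAT, is_paid BOOLEAN)")
conn.execute("CREATE TABLE customers (id INT, customer_id INT, balance FLOAT)")
conn.executemany("INSERT INTO orders VALUES (?, ?, ?)",
                 [(1, 9.5, True), (2, 3.0, False)])
conn.executemany("INSERT INTO customers VALUES (?, ?, ?)", [(10, 1, 0.0)])

selected = candidates[0]  # stands in for the user's selection
rows = run_join(conn, "orders", "customers", selected)
print(rows)
```

In this sketch the cheap type-based filter runs before the string comparison, mirroring the order of the discarding steps in the claim, so that string similarity is computed only for pairs that already passed the casting threshold.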