US 12,253,991 B2
System, method, and computer program product for feature analysis using an embedding tree
Yan Zheng, Los Gatos, CA (US); Wei Zhang, Fremont, CA (US); Michael Yeh, Newark, CA (US); Liang Wang, San Jose, CA (US); Junpeng Wang, Santa Clara, CA (US); Shubham Jain, Mountain View, CA (US); and Zhongfang Zhuang, San Jose, CA (US)
Assigned to Visa International Service Association, San Francisco, CA (US)
Appl. No. 18/280,828
Filed by Visa International Service Association, San Francisco, CA (US)
PCT Filed Jun. 9, 2022, PCT No. PCT/US2022/032863
§ 371(c)(1), (2) Date Sep. 7, 2023,
PCT Pub. No. WO2022/261345, PCT Pub. Date Dec. 15, 2022.
Claims priority of provisional application 63/209,113, filed on Jun. 10, 2021.
Prior Publication US 2024/0152499 A1, May 9, 2024
Int. Cl. G06F 16/00 (2019.01); G06F 16/22 (2019.01)
CPC G06F 16/2246 (2019.01)
20 Claims
 
1. A system for analyzing features associated with entities using an embedding tree, the system comprising:
at least one processor programmed or configured to:
receive a dataset associated with a plurality of entities, wherein the dataset comprises a plurality of data instances for the plurality of entities, wherein each data instance of the plurality of data instances comprises feature data associated with an entity of the plurality of entities, and wherein the feature data comprises a plurality of feature values of a plurality of features for the entity;
generate at least two embeddings based on the dataset associated with the plurality of entities, wherein the at least two embeddings comprise embedding data associated with the at least two embeddings, and wherein the embedding data comprises values of embedding vectors of the at least two embeddings;
determine split criteria that partitions an embedding space of at least one embedding tree associated with the dataset based on the feature data associated with an entity and the embedding data associated with the at least two embeddings; and
generate the at least one embedding tree having a plurality of nodes by splitting the embedding space based on the split criteria.
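
The steps recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation, and the claim does not specify how the split criteria are computed. The sketch assumes binary feature values, one embedding vector per entity, and a variance-reduction criterion: at each node it picks the feature whose partition most reduces the scatter of the embedding vectors around their centroid. The names Node, scatter, best_split, and build_tree, and the parameters max_depth and min_gain, are hypothetical and chosen for illustration only.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple

Vector = Sequence[float]


@dataclass
class Node:
    indices: List[int]               # data instances that fall in this node
    feature: Optional[int] = None    # feature used to split (None for a leaf)
    left: Optional["Node"] = None    # instances whose feature value is 0
    right: Optional["Node"] = None   # instances whose feature value is nonzero


def scatter(embeddings: Sequence[Vector], indices: List[int]) -> float:
    """Sum of squared distances from each embedding to the group centroid."""
    if not indices:
        return 0.0
    dim = len(embeddings[0])
    centroid = [sum(embeddings[i][d] for i in indices) / len(indices)
                for d in range(dim)]
    return sum((embeddings[i][d] - centroid[d]) ** 2
               for i in indices for d in range(dim))


def best_split(features: Sequence[Sequence[int]],
               embeddings: Sequence[Vector],
               indices: List[int]) -> Optional[Tuple[int, float]]:
    """Assumed split criterion: the feature with the largest scatter reduction."""
    parent = scatter(embeddings, indices)
    best: Optional[Tuple[int, float]] = None
    for f in range(len(features[0])):
        left = [i for i in indices if features[i][f] == 0]
        right = [i for i in indices if features[i][f] != 0]
        if not left or not right:
            continue  # this feature does not partition the node
        gain = parent - scatter(embeddings, left) - scatter(embeddings, right)
        if best is None or gain > best[1]:
            best = (f, gain)
    return best


def build_tree(features: Sequence[Sequence[int]],
               embeddings: Sequence[Vector],
               indices: Optional[List[int]] = None,
               max_depth: int = 3,
               min_gain: float = 1e-9) -> Node:
    """Recursively split the embedding space on feature values."""
    if indices is None:
        indices = list(range(len(features)))
    node = Node(indices=indices)
    if max_depth == 0:
        return node
    split = best_split(features, embeddings, indices)
    if split is None or split[1] <= min_gain:
        return node  # no useful split: keep this node as a leaf
    f = split[0]
    node.feature = f
    node.left = build_tree(features, embeddings,
                           [i for i in indices if features[i][f] == 0],
                           max_depth - 1, min_gain)
    node.right = build_tree(features, embeddings,
                            [i for i in indices if features[i][f] != 0],
                            max_depth - 1, min_gain)
    return node
```

Calling build_tree(features, embeddings) with one row of feature values and one embedding vector per entity returns the root Node. The feature stored at each internal node indicates which feature best explains the separation of the embedding vectors at that level, under the assumed criterion.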