| CPC G06F 16/21 (2019.01) [G06F 16/2358 (2019.01); G06F 16/2379 (2019.01); G06F 16/2458 (2019.01); G06F 16/248 (2019.01); G06F 16/287 (2019.01); G06F 16/29 (2019.01); G06F 16/9024 (2019.01); G06F 16/9535 (2019.01); G06Q 30/02 (2013.01); G06Q 40/06 (2013.01)] | 32 Claims |

|
1. A computer-implemented method for determining attributes of a first set of entities based on attributes of a second set of entities, the method comprising:
receiving, by a processor, a dataset comprising attribute data for the second set of entities;
generating, by the processor, a multi-dimensional space by:
mapping at least one attribute to a dimension in the space, and
assigning coordinates to at least one entity in the second set based on its attribute values;
analyzing, by the processor, stored data for the first set of entities;
positioning, by the processor, representations of the first set of entities within the multi-dimensional space using a machine learning algorithm trained on the positions of the second set of entities;
calculating, by the processor, proximities between the representations of the first set of entities and the second set of entities within the multi-dimensional space using a distance metric;
determining, by the processor, attributes for the first set of entities by:
identifying a subset of nearest neighbors from the second set for each entity in the first set by applying a nearest neighbor algorithm to the calculated proximities,
applying a pre-processing algorithm to the attributes of the subset of nearest neighbors,
inferring at least one missing attribute value for the first set of entities; and
outputting, by the processor, the determined attributes for the first set of entities.
|
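The steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation, and everything specific in it is an assumption made for illustration: the entity names, the attributes `age`, `income`, and `risk`, the choice of Euclidean distance as the distance metric, `k=2`, and the function names. In particular, the "machine learning algorithm trained on the positions of the second set" is reduced here to a trivial stand-in that learns per-dimension means from the reference positions, and the "pre-processing algorithm" is reduced to dropping neighbors that lack the attribute.

```python
import math
from statistics import mean

# Hypothetical second set (reference entities): all attributes known.
REFERENCE = {
    "ref_a": {"age": 30.0, "income": 50.0, "risk": 0.2},
    "ref_b": {"age": 32.0, "income": 55.0, "risk": 0.3},
    "ref_c": {"age": 60.0, "income": 90.0, "risk": 0.8},
    "ref_d": {"age": 58.0, "income": 85.0, "risk": 0.7},
}
# Hypothetical first set (target entities): "risk" is missing.
TARGETS = {
    "tgt_x": {"age": 31.0, "income": 52.0},
    "tgt_y": {"age": 59.0, "income": 88.0},
}
DIMENSIONS = ["age", "income", "risk"]  # one attribute mapped to each dimension


def to_coords(attrs, dims):
    """Assign coordinates to a reference entity from its attribute values."""
    return [attrs[d] for d in dims]


def fit_positioner(ref_coords):
    """Stand-in for the trained model: per-dimension means of reference positions."""
    return [mean(column) for column in zip(*ref_coords.values())]


def position(attrs, model, dims):
    """Place a target in the space; unknown coordinates fall back to the learned mean."""
    return [attrs.get(d, m) for d, m in zip(dims, model)]


def preprocess(reference, neighbors, dim):
    """Stand-in pre-processing: keep only neighbor values that are present."""
    return [reference[n][dim] for n in neighbors if reference[n].get(dim) is not None]


def infer_missing(targets, reference, dims, k=2):
    """Return each target's attributes with missing values inferred from k neighbors."""
    ref_coords = {name: to_coords(attrs, dims) for name, attrs in reference.items()}
    model = fit_positioner(ref_coords)
    determined = {}
    for name, attrs in targets.items():
        point = position(attrs, model, dims)
        # Proximities to every reference entity (Euclidean distance is assumed).
        proximities = {ref: math.dist(point, c) for ref, c in ref_coords.items()}
        # Nearest neighbor selection applied to the calculated proximities.
        neighbors = sorted(proximities, key=proximities.get)[:k]
        completed = dict(attrs)
        for d in dims:
            if d not in attrs:
                values = preprocess(reference, neighbors, d)
                if values:  # leave the attribute missing if no neighbor has it
                    completed[d] = mean(values)
        determined[name] = completed
    return determined


if __name__ == "__main__":
    for entity, attrs in infer_missing(TARGETS, REFERENCE, DIMENSIONS).items():
        print(entity, attrs)
```

Because the attributes are left unscaled, `age` and `income` dominate the distance in this sketch; a practical system would likely normalize each dimension or use a different metric, a choice the claim leaves open.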