CPC G06F 16/245 (2019.01) [G06F 16/26 (2019.01); G06F 16/35 (2019.01); G06F 16/353 (2019.01); G06F 16/93 (2019.01); G06F 16/9535 (2019.01); G06F 40/295 (2020.01)] | 12 Claims |
1. A system for identifying an entity having a specified entity attribute in a document, the system comprising:
one or more processors; and
a memory, coupled to the one or more processors, storing code that when executed by the one or more processors causes the one or more processors to perform operations comprising:
receiving, from each process of a plurality of processes, a corresponding set of candidate identity attributes that are each for identifying a particular entity having the specified entity attribute in the document, wherein each process of the plurality of processes generates the corresponding set of candidate identity attributes based on the specified entity attribute in the document;
calculating a score for each candidate identity attribute in the set of candidate identity attributes, wherein calculating the score for a particular candidate identity attribute comprises (1) identifying a set of tokens in the particular candidate identity attribute, (2) assigning a value to each token in the set of tokens based on a token count that represents a number of instances of the token across the set of candidate identity attributes, and (3) calculating the score based on the assigned values; and
identifying, based on the scores calculated for the candidate identity attributes, an identity attribute from the set of candidate identity attributes that identifies the entity having the specified entity attribute in the document.
|
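The scoring operations recited in claim 1 can be illustrated with a minimal Python sketch. The claim does not specify a tokenizer, a value function, or a way of combining values, so everything concrete here is an assumption made for illustration: tokens are lowercased and whitespace-delimited, the value of a token is its raw count across all candidates, the score is the mean of the token values, and the highest-scoring candidate is selected. The function names (`tokenize`, `score_candidates`, `select_identity`) and the sample candidate strings are hypothetical.

```python
from collections import Counter
from typing import Optional


def tokenize(candidate: str) -> list[str]:
    # Assumption: lowercase, whitespace-delimited tokens.
    return candidate.lower().split()


def score_candidates(candidates: list[str]) -> dict[str, float]:
    # Token count: number of instances of each token across all candidates.
    token_counts = Counter(tok for c in candidates for tok in tokenize(c))
    scores: dict[str, float] = {}
    for candidate in candidates:
        # Assumption: the value assigned to a token is its raw count.
        values = [token_counts[tok] for tok in tokenize(candidate)]
        # Assumption: the score is the mean of the assigned values.
        scores[candidate] = sum(values) / len(values) if values else 0.0
    return scores


def select_identity(candidate_sets: list[list[str]]) -> Optional[str]:
    # Merge the candidate sets received from each process into one list.
    candidates = [c for candidate_set in candidate_sets for c in candidate_set]
    if not candidates:
        return None
    scores = score_candidates(candidates)
    # Assumption: the identity attribute is the highest-scoring candidate.
    return max(scores, key=scores.get)


# Hypothetical output of three extraction processes run on one document.
per_process = [
    ["Acme Corp", "John Smith"],
    ["Acme Corporation"],
    ["Acme Corp"],
]
best = select_identity(per_process)
```

Under these assumptions, a candidate whose tokens recur across the outputs of several processes scores higher than one proposed by a single process, so agreement among processes drives the selection. Averaging rather than summing keeps longer candidates from winning on length alone; other value or aggregation functions would fit the claim language equally well.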