US 12,314,666 B2
Stable identification of entity mentions
Aaron Michael Taylor, Cambridge, MA (US); Henry Forrest Leanna Wallace, Cambridge, MA (US); John Randolph Frank, Cambridge, MA (US); and Andrew Richard Gallant, Marlborough, MA (US)
Assigned to Salesforce, Inc., San Francisco, CA (US)
Filed by salesforce.com, inc., San Francisco, CA (US)
Filed on May 3, 2021, as Appl. No. 17/306,656.
Claims priority of provisional application 63/019,033, filed on May 1, 2020.
Prior Publication US 2021/0342541 A1, Nov. 4, 2021
Int. Cl. G06F 40/10 (2020.01); G06F 16/332 (2019.01); G06F 40/295 (2020.01); G06N 5/02 (2023.01); G06N 5/04 (2023.01)
CPC G06F 40/295 (2020.01) [G06F 16/3328 (2019.01); G06N 5/02 (2013.01); G06N 5/04 (2013.01)] 17 Claims
1. A method for stable identification of entities mentioned in dynamic documents comprising:
processing a corpus of documents to identify a first plurality of entity mentions of a first plurality of entities;
assigning a stable unique identifier to each of the first plurality of entity mentions identified within a particular document in the corpus of documents, the stable unique identifier persistently tracking particular entity mentions within particular documents in a persistent manner that spans across multiple textual revisions of the particular documents, the multiple textual revisions of the particular documents retaining the particular entity mentions from revision to revision;
storing each stable unique identifier and each corresponding one of the first plurality of entity mentions in a database;
detecting an updated document containing a change to a text in a document from the corpus of documents;
processing the updated document to identify a second plurality of entity mentions of a second plurality of entities;
contextually analyzing the text and piecewise comparing the second plurality of entity mentions to the first plurality of entity mentions to evaluate whether an entity mention identified within the text corresponds to a prior mention of the entity, based on a window of surrounding text, or surrounding sentences, or similar text;
for an aligned entity mention from the second plurality of entity mentions in the updated document corresponding to one of the first plurality of entity mentions in the document, storing the aligned entity mention in the database in association with one of the stable unique identifiers for the one of the first plurality of entity mentions;
for a new mention from the second plurality of entity mentions not corresponding to one of the first plurality of entity mentions, storing the new entity mention in the database as a new entry with a new unique and global identifier;
for a missing mention included in the first plurality of mentions and missing from the second plurality of mentions, removing a corresponding one of the unique identifiers to indicate that the missing mention has been removed from the updated document; and
automatically generating and displaying a knowledge graph using at least one of the aligned mentions and at least one of the new mentions in the database based on the new unique and global identifier by visually associating concept icons of the aligned mentions and the new mentions on a graphical user interface.
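The alignment steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation: the exact-string extractor `find_mentions`, the fixed character window, the `difflib` similarity score, the 0.6 threshold, and the in-memory dictionary standing in for the database are all assumptions made for illustration. The knowledge-graph display step is omitted.

```python
import uuid
from dataclasses import dataclass
from difflib import SequenceMatcher


@dataclass
class Mention:
    text: str   # surface form of the entity mention
    start: int  # character offset within the document


def find_mentions(doc, names):
    """Toy extractor (assumption): exact matches of a known name list."""
    found = []
    for name in names:
        pos = doc.find(name)
        while pos != -1:
            found.append(Mention(name, pos))
            pos = doc.find(name, pos + 1)
    return sorted(found, key=lambda m: m.start)


def context(doc, mention, window=30):
    """Return the mention plus a window of surrounding text."""
    lo = max(0, mention.start - window)
    hi = min(len(doc), mention.start + len(mention.text) + window)
    return doc[lo:hi]


def index_document(doc, names):
    """Assign a fresh identifier to every mention in the original document."""
    return {uuid.uuid4().hex: m for m in find_mentions(doc, names)}


def align(store, old_doc, new_doc, names, threshold=0.6):
    """Compare mentions of the updated document against the stored ones.

    Returns (new_store, aligned_ids, new_ids, removed_ids).
    """
    unmatched = dict(store)  # stored mentions not yet aligned
    new_store, aligned, added = {}, [], []
    for m in find_mentions(new_doc, names):
        best_id, best_score = None, threshold
        for sid, old in unmatched.items():
            if old.text != m.text:
                continue
            score = SequenceMatcher(
                None, context(old_doc, old), context(new_doc, m)
            ).ratio()
            if score >= best_score:
                best_id, best_score = sid, score
        if best_id is not None:
            # Aligned mention: keep the existing identifier.
            del unmatched[best_id]
            new_store[best_id] = m
            aligned.append(best_id)
        else:
            # New mention: store it under a new identifier.
            new_id = uuid.uuid4().hex
            new_store[new_id] = m
            added.append(new_id)
    # Whatever was never aligned is missing from the updated document.
    removed = list(unmatched)
    return new_store, aligned, added, removed
```

In this sketch a mention keeps its identifier only when the same surface form reappears with sufficiently similar surrounding text, so an edit elsewhere in the document does not disturb identifiers of mentions that were retained.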