US 12,271,914 B2
Method and system for understanding financial documents
Simerjot Kaur, Jersey City, NJ (US); Charese Smiley, Bristol, WI (US); Joy Sain, Fairborn, OH (US); Suchetha Siddagangappa, Brooklyn, NY (US); Akshat Gupta, New York, NY (US); and Sameena Shah, Scarsdale, NY (US)
Assigned to JPMORGAN CHASE BANK, N.A., New York, NY (US)
Filed by JPMorgan Chase Bank, N.A., New York, NY (US)
Filed on Jan. 7, 2022, as Appl. No. 17/647,356.
Prior Publication US 2023/0237512 A1, Jul. 27, 2023
Int. Cl. G06F 40/295 (2020.01); G06N 5/02 (2023.01); G06Q 30/0201 (2023.01)
CPC G06Q 30/0201 (2013.01) [G06F 40/295 (2020.01); G06N 5/02 (2013.01)]
20 Claims
 
1. A method for characterizing information contained in a financial document, the method being implemented by at least one processor, the method comprising:
receiving, by the at least one processor, a document;
extracting, by the at least one processor, raw text included in the document;
identifying, by the at least one processor based on the extracted raw text, a plurality of entities that are named in the document;
determining, by the at least one processor based on the extracted raw text, respective relationship information that corresponds to at least one pair of entities from among the plurality of entities by applying a Bidirectional Encoder Representations from Transformers (BERT) model that is trained to learn contextual representation surrounding the plurality of entities and that uses a distance supervision paradigm and an entity type restricted relation classifier to determine the respective relationship information by:
  identifying a first entity pair from among two entities of the plurality of entities that are a positive example;
  calculating a first shortest dependency path between the two entities of the first entity pair;
  identifying a second entity pair from among two entities of the plurality of entities;
  calculating a second shortest dependency path between the two entities of the second entity pair;
  determining a similarity between the first shortest dependency path and the second shortest dependency path; and
  determining, based on the determined similarity, whether the second entity pair is in a positive example category or a negative example category; and
outputting, by the at least one processor, a subset of the determined respective relationship information.
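
The labeling steps recited in claim 1 (shortest dependency path for a known positive pair, shortest dependency path for a candidate pair, similarity between the two paths, positive or negative category) might be sketched as follows. This is an illustrative sketch only, not the claimed implementation. The claim does not specify a parser, a similarity measure, or a threshold, so everything named here is an assumption: the helper names shortest_dependency_path, path_similarity, and label_pair are hypothetical, the dependency parse is supplied as hand-written (head, dependent) pairs, similarity is taken to be Jaccard overlap of the tokens lying between the two entities, and the cutoff of 0.5 is arbitrary. The BERT model and the entity type restricted relation classifier are not modeled.

```python
from collections import deque


def shortest_dependency_path(edges, source, target):
    """Return the tokens on the shortest path between two tokens, or None.

    `edges` is a list of (head, dependent) pairs from a dependency parse,
    treated as an undirected graph. Tokens are identified by their string,
    which is a simplification (a real parse would use token indices).
    """
    graph = {}
    for head, dependent in edges:
        graph.setdefault(head, set()).add(dependent)
        graph.setdefault(dependent, set()).add(head)
    if source not in graph or target not in graph:
        return None

    # Breadth-first search, recording each node's parent to rebuild the path.
    parents = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            path = []
            while node is not None:
                path.append(node)
                node = parents[node]
            return path[::-1]
        for neighbor in sorted(graph[node]):
            if neighbor not in parents:
                parents[neighbor] = node
                queue.append(neighbor)
    return None


def path_similarity(path_a, path_b):
    """Jaccard overlap of the interior tokens of two paths (assumed metric)."""
    inner_a, inner_b = set(path_a[1:-1]), set(path_b[1:-1])
    if not inner_a and not inner_b:
        return 1.0  # both entity pairs are directly connected
    return len(inner_a & inner_b) / len(inner_a | inner_b)


def label_pair(seed_path, candidate_path, threshold=0.5):
    """Assign a candidate pair to the positive or negative category.

    The threshold is an assumed value, not one taken from the patent.
    """
    if candidate_path is None:
        return "negative"
    similar = path_similarity(seed_path, candidate_path) >= threshold
    return "positive" if similar else "negative"


# Toy parses, written by hand for illustration.
seed_edges = [("acquired", "Acme"), ("acquired", "Beta")]    # "Acme acquired Beta"
same_edges = [("acquired", "Gamma"), ("acquired", "Delta")]  # "Gamma acquired Delta"
other_edges = [("sued", "Gamma"), ("sued", "Delta")]         # "Gamma sued Delta"

seed_path = shortest_dependency_path(seed_edges, "Acme", "Beta")
same_path = shortest_dependency_path(same_edges, "Gamma", "Delta")
other_path = shortest_dependency_path(other_edges, "Gamma", "Delta")

print(label_pair(seed_path, same_path))   # positive
print(label_pair(seed_path, other_path))  # negative
```

In this sketch the entity tokens themselves are excluded from the comparison, so two pairs count as similar when the words connecting them match, regardless of which entities are involved. Other comparisons, such as matching dependency labels along the path, would fit the claim language equally well.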