US 12,254,280 B2
Document classification
Eric Jon Bersagel, Fort Collins, CO (US); Sibi Mollykutty Luke, Kerala (IN); and Alex Kallolickal Joseph, Tiruvallur (IN)
Assigned to GLOBAL HEALTHCARE EXCHANGE, LLC, Louisville, CO (US)
Filed by Global Healthcare Exchange, LLC, Louisville, CO (US)
Filed on Oct. 11, 2023, as Appl. No. 18/378,948.
Claims priority of application No. 202311027147 (IN), filed on Apr. 12, 2023.
Prior Publication US 2024/0346257 A1, Oct. 17, 2024
Int. Cl. G06F 16/00 (2019.01); G06F 40/40 (2020.01)
CPC G06F 40/40 (2020.01)
20 Claims
 
1. A method comprising:
detecting, by one or more processors, one or more languages from different countries in a document;
assigning, by the one or more processors, a weight to the one or more languages from different countries;
determining, by the one or more processors, document content concepts in the document from a bag of words associated with the document, wherein the document content concepts include an attribute of a type of document;
determining, by the one or more processors, that synonyms are associated with the document content concepts, wherein the synonyms include at least one of a wording variation, abbreviation or acronym of the document content concepts that is used to link the synonyms to the document content concepts;
linking, by the one or more processors, the synonyms to the document content concepts;
assigning, by the one or more processors, a weight to each of the document content concepts, wherein the document content concepts include the synonyms to the respective document content concepts;
scoring, by the one or more processors, the document with a classification score based on the weight of each of the document content concepts and the weights of the one or more languages from different countries in the document;
determining, by the one or more processors, that the classification score meets a threshold;
determining, by the one or more processors, a pattern in the document;
determining, by the one or more processors, an object within the pattern in the document;
creating, by the one or more processors, a region around the object using x-y coordinates;
searching, by the one or more processors, in the region for data relevant to the object;
determining, by the processor, that the document lacks personal health information;
determining, by the processor, that the document lacks personal credit information;
determining, by the one or more processors, that one or more rejected keywords are not in the bag of words;
avoiding, by the one or more processors, portions of the document based on an opt-out request;
classifying, by the one or more processors, the document based on the classification score; and
assigning, by the one or more processors, and based on the classifying, the document to at least one of a release report in response to the document being a valid document or an exemption report in response to the document being a rejected document.
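The scoring and report-assignment steps of claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation. The claim does not say how concept weights and language weights are combined, so the sketch assumes a simple sum. Every name, weight, keyword, and threshold below is hypothetical, and language detection is assumed to have happened upstream, with its result passed in as a list of language codes.

```python
import re
from collections import Counter

# All values below are hypothetical placeholders, not taken from the patent.
CONCEPT_WEIGHTS = {"invoice": 3.0, "supplier": 1.0, "total": 1.5}
SYNONYMS = {"inv": "invoice", "vendor": "supplier", "amt": "total"}
LANGUAGE_WEIGHTS = {"en": 1.0, "de": 0.8}
REJECTED_KEYWORDS = {"draft"}
THRESHOLD = 4.0


def bag_of_words(text):
    """Build a lowercase bag of words (word -> count) from raw text."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def linked_concepts(bag):
    """Map each word to its concept via the synonym table and keep known concepts."""
    found = set()
    for word in bag:
        concept = SYNONYMS.get(word, word)
        if concept in CONCEPT_WEIGHTS:
            found.add(concept)
    return found


def classification_score(text, languages):
    """Assumed scoring rule: sum of concept weights plus sum of language weights."""
    concepts = linked_concepts(bag_of_words(text))
    concept_part = sum(CONCEPT_WEIGHTS[c] for c in concepts)
    language_part = sum(LANGUAGE_WEIGHTS.get(code, 0.0) for code in languages)
    return concept_part + language_part


def assign_report(text, languages):
    """Return 'release' for a valid document, 'exemption' for a rejected one."""
    bag = bag_of_words(text)
    meets_threshold = classification_score(text, languages) >= THRESHOLD
    no_rejected_words = REJECTED_KEYWORDS.isdisjoint(bag)
    return "release" if meets_threshold and no_rejected_words else "exemption"


sample = "INV 1042 from vendor, amt due 500"
print(classification_score(sample, ["en"]))
print(assign_report(sample, ["en"]))
print(assign_report("DRAFT " + sample, ["en"]))
```

The sketch counts each concept once per document regardless of how often it appears. A frequency-weighted variant would be an equally plausible reading of the claim. The pattern, region, opt-out, and personal-information checks recited in the claim are omitted here.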