US 11,687,647 B2
Method and electronic device for generating semantic representation of document to determine data security risk
Madhusudana Shashanka, Austin, TX (US); Bonnie Arogyam Varghese, Milpitas, CA (US); Shankar Subramaniam, Cupertino, CA (US); Karthik Krishnan, San Jose, CA (US); and Rency Joseph, Santa Clara, CA (US)
Assigned to CONCENTRIC SOFTWARE, INC., Saratoga, CA (US)
Filed by CONCENTRIC SOFTWARE, INC., San Jose, CA (US)
Filed on Jan. 27, 2021, as Appl. No. 17/160,369.
Claims priority of provisional application 62/966,663, filed on Jan. 28, 2020.
Prior Publication US 2021/0256115 A1, Aug. 19, 2021
Int. Cl. G06F 21/55 (2013.01)
CPC G06F 21/552 (2013.01) [G06F 2221/034 (2013.01)]
18 Claims
1. A method for generating semantic representation of a document using an electronic device (100) to determine data security risk associated with the document, the method comprising:
receiving, by a document semantics controller (160) of the electronic device (100), a document in an electronic form, wherein the document comprises a plurality of content;
determining, by the document semantics controller (160) of the electronic device (100), raw text from the plurality of content;
generating, by the document semantics controller (160) of the electronic device (100), a plurality of sentence blocks of a predefined size using the raw text;
determining, by the document semantics controller (160) of the electronic device (100), at least one embeddings for each of the plurality of sentence blocks;
determining, by the document semantics controller (160) of the electronic device (100), the semantic representation of the document based on the at least one embeddings for each of the plurality of sentence blocks;
generating, by the document semantics controller (160) of the electronic device (100), the semantic representation of the document to determine the data security risk associated with the document;
determining, by the document semantics controller (160) of the electronic device (100), at least one attribute of a plurality of attributes associated with a user requesting access to the document, wherein at least one attribute indicates a user security risk profile;
determining, by the document semantics controller (160) of the electronic device (100), a document security risk profile based on the semantic representation of the document and semantic representation of neighboring documents; and
determining, by the document semantics controller (160) of the electronic device (100), whether the user security risk profile matches the document security risk profile to determine access to the document.
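The steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation: the claim does not name an embedding model, a pooling rule, a neighbor-selection rule, or a matching rule, so every one of those is an assumption made here for illustration. Specifically, `embed_block` is a toy hashed bag-of-words stand-in for a real sentence-embedding model, the document representation is assumed to be the mean of the block embeddings, the document risk is assumed to be the highest risk level among the `k` most similar neighboring documents, and a user is assumed to "match" when a numeric clearance level is at least the document risk level. All function names, `DIM`, `block_size`, and `k` are hypothetical.

```python
import hashlib
import math
import re

DIM = 64  # assumed embedding width; the claim does not specify one


def sentence_blocks(raw_text, block_size=2):
    """Group the sentences of raw_text into blocks of block_size sentences."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", raw_text) if s.strip()]
    return [" ".join(sentences[i:i + block_size])
            for i in range(0, len(sentences), block_size)]


def embed_block(block):
    """Toy hashed bag-of-words embedding (assumption: stands in for a real model)."""
    vec = [0.0] * DIM
    for token in re.findall(r"[a-z0-9]+", block.lower()):
        idx = int(hashlib.md5(token.encode("utf-8")).hexdigest(), 16) % DIM
        vec[idx] += 1.0
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec


def document_representation(raw_text, block_size=2):
    """Assumed pooling rule: mean of the per-block embeddings."""
    embeddings = [embed_block(b) for b in sentence_blocks(raw_text, block_size)]
    if not embeddings:
        return [0.0] * DIM
    return [sum(column) / len(embeddings) for column in zip(*embeddings)]


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)


def document_risk(doc_vec, neighbors, k=3):
    """Assumed rule: highest risk level among the k most similar neighbors.

    neighbors is a list of (vector, risk_level) pairs for already-labeled documents.
    """
    ranked = sorted(neighbors, key=lambda pair: cosine(doc_vec, pair[0]), reverse=True)
    return max((risk for _, risk in ranked[:k]), default=0)


def access_allowed(user_clearance, doc_risk_level):
    """Assumed matching rule: clearance must be at least the document risk level."""
    return user_clearance >= doc_risk_level
```

In use, a new document would be converted with `document_representation`, scored against previously labeled documents with `document_risk`, and the result compared with the requesting user's clearance via `access_allowed`. A production system would replace the hashed embedding with a trained language model and would likely use a richer user risk profile than a single integer.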