US 12,423,328 B2
Systems, methods, and apparatuses for automatically classifying data based on data usage and accessing patterns in an electronic network
Marcus Raphael Matos, Richardson, TX (US); Richard Scot, Huntersville, NC (US); Daniel Joseph Serna, The Colony, TX (US); and Matthew K. Bryant, Gastonia, NC (US)
Assigned to BANK OF AMERICA CORPORATION, Charlotte, NC (US)
Filed by BANK OF AMERICA CORPORATION, Charlotte, NC (US)
Filed on Nov. 29, 2022, as Appl. No. 18/071,227.
Prior Publication US 2024/0176804 A1, May 30, 2024
Int. Cl. G06F 16/28 (2019.01)
CPC G06F 16/285 (2019.01)
19 Claims
 
1. A system for automatically classifying data based on data usage and accessing patterns, the system comprising:
a memory device with computer-readable program code stored thereon;
at least one processing device operatively coupled to the at least one memory device and the at least one communication device, wherein executing the computer-readable code is configured to cause the at least one processing device to:
receive at least one query log comprising a plurality of data identifiers;
generate, by the at least one processing device, a data identifier total based on each data identifier of the plurality of data identifiers;
determine, by the at least one processing device, a data classification for each data identifier based on the data identifier total, wherein the data classification comprises at least one of an important classification or an unimportant classification, wherein the data classification determination comprises:
determining, based on the query log, a source identifier for each data identifier of the plurality of data identifiers,
determining, based on the query log, a target identifier for each data identifier of the plurality of data identifiers, wherein the target identifier comprises a target destination associated with the data identifier,
determining whether the source identifier and the target identifier are different for each data identifier, and
generating, in an instance where the source identifier and target identifier are different, the important classification for the data identifier;
generate, by the at least one processing device, a data catalogue comprising at least one data identifier associated with the important classification;
generate, by the at least one processing device, a wide data classification for each data identifier associated with the important classification;
generate a wide database comprising a large volume for the data and the data identifier comprising the wide data classification, wherein the large volume comprises an attribute within the wide database for each piece of data associated with the data identifier;
automatically update, based on the wide data classification for each data identifier associated with the important classification, the wide database with the data and the data identifier of each data identifier comprising the wide data classification, wherein the updating of the wide database comprises a storage of the data and the data identifier;
receive, at a later instance to the generation and updating of the wide database, a query comprising at least one data identifier from the plurality of data identifiers, wherein the at least one data identifier from the query is associated with the wide data classification; and
automatically collect, in response to the received query, the data of the at least one identifier from the wide database.
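
The steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation. The names `LogEntry`, `classify`, `build_catalogue`, `build_wide_database`, and `collect`, the dictionary-based "wide database", and the sample log are all hypothetical. The claim does not say how the data identifier total feeds the classification, so the sketch only counts occurrences and bases the important/unimportant decision on the source/target comparison. It also assumes that every important identifier receives the wide data classification.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class LogEntry:
    """One query-log record (hypothetical layout, not taken from the patent)."""
    data_id: str
    source: str
    target: str


def classify(entries):
    """Count each data identifier and mark it important if any log entry
    shows a source identifier that differs from its target identifier."""
    totals = Counter(e.data_id for e in entries)
    classification = {data_id: "unimportant" for data_id in totals}
    for e in entries:
        if e.source != e.target:
            classification[e.data_id] = "important"
    return totals, classification


def build_catalogue(classification):
    """Data catalogue: the identifiers carrying the important classification."""
    return sorted(d for d, c in classification.items() if c == "important")


def build_wide_database(catalogue, data_store):
    """Assumption: every catalogued identifier is classified as wide, and its
    data is stored as one attribute mapping per identifier."""
    return {d: dict(data_store.get(d, {})) for d in catalogue}


def collect(wide_db, query_ids):
    """Answer a later query by collecting the stored data from the wide database."""
    return {d: wide_db[d] for d in query_ids if d in wide_db}


# Hypothetical sample data, for illustration only.
log = [
    LogEntry("cust_id", "crm", "warehouse"),
    LogEntry("cust_id", "crm", "crm"),
    LogEntry("tmp_flag", "etl", "etl"),
]
data_store = {"cust_id": {"name": "Ada"}, "tmp_flag": {"value": 1}}

totals, classification = classify(log)
catalogue = build_catalogue(classification)
wide_db = build_wide_database(catalogue, data_store)
result = collect(wide_db, ["cust_id", "tmp_flag"])
```

In this sketch `tmp_flag` never leaves its source system, so it stays unimportant, is left out of the catalogue and the wide database, and is absent from the query result.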