US 12,141,098 B2
Content hiding software identification and/or extraction system and method
Gokila Dorai, Tallahassee, FL (US); Sudhir Aggarwal, Tallahassee, FL (US); Charisa Powell, Tallahassee, FL (US); and Neet Patel, Tallahassee, FL (US)
Assigned to The Florida State University Research Foundation, Inc., Tallahassee, FL (US)
Filed by The Florida State University Research Foundation, Inc., Tallahassee, FL (US)
Filed on Jun. 30, 2022, as Appl. No. 17/854,793.
Claims priority of provisional application 63/216,767, filed on Jun. 30, 2021.
Prior Publication US 2023/0012801 A1, Jan. 19, 2023
Int. Cl. G06F 16/00 (2019.01); G06F 16/14 (2019.01); G06F 18/23 (2023.01); G06F 30/27 (2020.01)
CPC G06F 16/156 (2019.01) [G06F 18/23 (2023.01); G06F 30/27 (2020.01)]
17 Claims
 
1. A method to identify content hiding software, the method comprising:
pre-identifying content hiding software in curation platforms, wherein pre-identifying comprises:
scanning titles and/or subtitles in a curation platform using a first set of keywords;
returning title, subtitle, and bundle identifiers of a plurality of potential content hiding software;
classifying, via one or more trained machine learning classifiers, a software from the plurality of potential content hiding software as content hiding software or non-content hiding software, wherein the one or more classifiers have been trained using a feature set comprising a second set of keywords associated with a title, subtitle, and/or bundle identifiers of software, wherein the first set of keywords is smaller than the second set of keywords; and
storing the classified content hiding software in a database as pre-identified content hiding software;
acquiring, from a user device or a remote computing device operatively coupled to the user device, configuration data of the user device;
parsing the configuration data into a list of installed software and respective bundle identifiers; and
identifying installed software on the user device as a content hiding software by (i) comparing the list of installed software to pre-identified content hiding software in the database and (ii) extracting hidden information from artifacts of stored data of the installed software on the user device.
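
By way of non-limiting illustration, the pre-identification and comparison steps recited in claim 1 might be sketched as follows. Everything named here is an assumption made for the example and is not taken from the patent: the keyword sets, the catalog entries, the JSON layout of the configuration data, and all function names are hypothetical. In particular, `stand_in_classifier` is a simple keyword-count threshold used as a placeholder where the claim recites one or more trained machine learning classifiers, and the extraction of hidden information from stored artifacts (step (ii)) is not covered.

```python
import json

# Hypothetical keyword sets. The claim only requires that the first
# (scanning) set be smaller than the second (feature) set.
SCAN_KEYWORDS = {"vault", "hide", "secret"}
FEATURE_KEYWORDS = sorted(SCAN_KEYWORDS | {"private", "calculator", "lock", "photo"})


def scan_catalog(catalog, keywords=SCAN_KEYWORDS):
    """Return catalog entries whose title or subtitle contains a scan keyword."""
    hits = []
    for entry in catalog:
        text = f"{entry['title']} {entry['subtitle']}".lower()
        if any(keyword in text for keyword in keywords):
            hits.append(entry)
    return hits


def keyword_features(entry, vocabulary=FEATURE_KEYWORDS):
    """Binary keyword-presence vector over title, subtitle, and bundle identifier."""
    text = f"{entry['title']} {entry['subtitle']} {entry['bundle_id']}".lower()
    return [1 if keyword in text else 0 for keyword in vocabulary]


def stand_in_classifier(features, threshold=2):
    """Placeholder for a trained classifier: True when at least
    `threshold` feature keywords are present."""
    return sum(features) >= threshold


def build_database(catalog):
    """Bundle identifiers of candidates that the placeholder classifier flags."""
    return {
        entry["bundle_id"]
        for entry in scan_catalog(catalog)
        if stand_in_classifier(keyword_features(entry))
    }


def parse_configuration(raw_json):
    """Parse configuration data (assumed JSON layout) into (name, bundle_id) pairs."""
    data = json.loads(raw_json)
    return [(app["name"], app["bundle_id"]) for app in data["installed_apps"]]


def match_installed(installed, database):
    """Installed software whose bundle identifier appears in the database."""
    return [(name, bundle_id) for name, bundle_id in installed if bundle_id in database]


# Made-up catalog and device configuration, for illustration only.
CATALOG = [
    {"title": "Secret Photo Vault", "subtitle": "Hide private pictures",
     "bundle_id": "com.example.photovault"},
    {"title": "Weather Now", "subtitle": "Forecasts",
     "bundle_id": "com.example.weather"},
    {"title": "Hide and Seek", "subtitle": "A party game",
     "bundle_id": "com.example.hideseek"},
]
CONFIG_JSON = json.dumps({"installed_apps": [
    {"name": "Secret Photo Vault", "bundle_id": "com.example.photovault"},
    {"name": "Weather Now", "bundle_id": "com.example.weather"},
]})

database = build_database(CATALOG)
flagged = match_installed(parse_configuration(CONFIG_JSON), database)
print(flagged)
```

In this sketch the small scanning set deliberately over-selects (it also returns the party game whose title contains "hide"), and the larger feature set is what lets the classification stage reject that candidate before it reaches the database.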