US 11,055,403 B2
Method, system, and computer program product for application identification in a cloud platform
Peng Fei Chen, Haidian District (CN); Fan Jing Meng, Haidian District (CN); Jing Min Xu, Haidian District (CN); Lin Yang, Haidian District (CN); and Xiao Zhang, Haidian District (CN)
Assigned to INTERNATIONAL BUSINESS MACHINES CORPORATION, Armonk, NY (US)
Filed by International Business Machines Corporation, Armonk, NY (US)
Filed on Jan. 6, 2017, as Appl. No. 15/400,383.
Prior Publication US 2018/0196684 A1, Jul. 12, 2018
Int. Cl. G06F 21/55 (2013.01); G06F 21/53 (2013.01); G06F 9/455 (2018.01); G06F 16/14 (2019.01); G06F 8/41 (2018.01); G06F 21/64 (2013.01)
CPC G06F 21/552 (2013.01) [G06F 8/41 (2013.01); G06F 16/152 (2019.01); G06F 21/53 (2013.01); G06F 21/64 (2013.01); G06F 2009/45591 (2013.01)]
18 Claims
 
1. A computer-implemented method for application identification, the method comprising:
extracting information related to one or more processes of one or more applications running on a virtual machine from a kernel space in a memory of the virtual machine in a cloud environment, wherein the extracting identifies the running applications based on structures related to processes stored in the kernel space which includes fixed structures in terms of positions and sizes;
building at least one first application signature based on the extracted information; and
identifying the one or more applications running on the virtual machine by matching the at least one first application signature with one or more second application signatures previously stored from a database in which each of the second application signatures corresponds to an application and is used as the signature of the application,
wherein the second application signatures previously stored are obtained by compiling source codes of known applications and extracting information from compilation information generated during the compiling,
wherein the identifying identifies the application by aggregating the information collected from a native library of a host of the virtual machine, and
wherein, if two signatures are extracted from different processes of the one or more processes, keeping the two signatures as two versions of the process running at a same time,
further comprising outputting a list of the identified one or more applications running on the virtual machine,
wherein the extracted information related to one or more processes of one or more applications includes information related to opened resources of the processes including at least one of opened file and opened network connection,
further comprising performing a reduce operation on the one or more processes in parallel by using MapReduce that results in a triple for each of the one or more processes including the virtual machine, the application, and a version of the application,
wherein, if after the reduce operation the two triples are obtained from a same process of the one or more processes, only keeping the process with the triple of the two triples with a higher count as a correct identification result.
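
Illustrative sketch (not part of the claim): the matching and reduce steps recited above could look roughly like the following. This is a minimal, single-process stand-in for a MapReduce job, not the patented implementation. The signature strings, the `KNOWN_SIGNATURES` table, and the function names `map_phase` and `reduce_phase` are all hypothetical, and the tie-breaking rule (first triple seen wins) is an assumption, since the claim only covers the case where one count is higher.

```python
from collections import Counter

# Hypothetical database of previously stored ("second") signatures.
# Each signature maps to one known application and version.
KNOWN_SIGNATURES = {
    "sig-nginx-1.10": ("nginx", "1.10"),
    "sig-nginx-1.12": ("nginx", "1.12"),
    "sig-redis-3.2": ("redis", "3.2"),
}


def map_phase(vm, observations):
    """Emit ((pid, triple), 1) for every extracted signature that matches.

    `observations` is an iterable of (pid, signature) pairs built from
    information extracted for each process; unmatched signatures are dropped.
    """
    for pid, signature in observations:
        match = KNOWN_SIGNATURES.get(signature)
        if match is not None:
            app, version = match
            yield (pid, (vm, app, version)), 1


def reduce_phase(pairs):
    """Sum counts per (pid, triple), then keep one triple per process.

    If one process yields two different triples, only the triple with the
    higher count is kept. Triples from different processes are all kept,
    so two versions of an application can be reported at the same time.
    """
    counts = Counter()
    for key, n in pairs:
        counts[key] += n

    best = {}
    for (pid, triple), count in counts.items():
        # Assumption: on a tie, the first triple encountered is kept.
        if pid not in best or count > best[pid][1]:
            best[pid] = (triple, count)
    return {pid: triple for pid, (triple, _count) in best.items()}


observations = [
    (101, "sig-nginx-1.10"),
    (101, "sig-nginx-1.10"),
    (101, "sig-nginx-1.12"),  # conflicting match for the same process
    (202, "sig-redis-3.2"),
    (303, "sig-unknown"),     # no stored signature, so not identified
    (404, "sig-nginx-1.12"),  # different process, kept as a second version
]

result = reduce_phase(map_phase("vm-1", observations))
identified = sorted(set(result.values()))  # list of identified applications
```

In this example process 101 produces two candidate triples, and only the one counted twice survives, while process 404 keeps its own triple, so both nginx versions appear in `identified`. A real deployment would distribute the map and reduce steps across workers rather than run them in one loop.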