US 11,790,016 B2
Method, device and computer program for collecting data from multi-domain
Sang Duk Suh, Gyeonggi-do (KR); Changhoon Yoon, Gyeonggi-do (KR); and Seung Hyeon Lee, Daejeon (KR)
Assigned to S2W INC.
Appl. No. 17/431,697
Filed by S2W LAB INC., Gyeonggi-do (KR)
PCT Filed Jan. 30, 2020, PCT No. PCT/KR2020/001382
§ 371(c)(1), (2) Date Aug. 17, 2021,
PCT Pub. No. WO2020/171410, PCT Pub. Date Aug. 27, 2020.
Claims priority of application No. 10-2019-0019087 (KR), filed on Feb. 19, 2019.
Prior Publication US 2022/0138271 A1, May 5, 2022
Int. Cl. G06F 16/00 (2019.01); G06F 16/951 (2019.01); H04L 9/40 (2022.01)
CPC G06F 16/951 (2019.01) [H04L 63/1425 (2013.01)]
6 Claims
1. A method for collecting data in a data collection device, comprising:
a step A of collecting data using a distributed crawler from a dark web site belonging to a network where channels are established by randomly connecting at least one or more network nodes that perform network routing functions, the dark web being not accessible with a general web browser and being accessible with preset specific software; and
a step B of standardizing the collected data in a preset format and generating metadata for the collected data,
wherein the step A includes
collecting domain information of the network;
identifying whether collected domains have been changed, and preferentially allocating a domain which is identified as being most recently registered to the distributed crawler; and
operating a plurality of network nodes that perform the routing function and collecting data from the dark web corresponding to an arbitrary domain by processing a request of the distributed crawler in the network nodes.
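
The domain allocation in step A and the standardization in step B can be illustrated with a minimal sketch. It is not the patented implementation. The names DomainScheduler, register, next_domain and standardize, the record layout, the SHA-256 digest field and the example domain names are all assumptions made for illustration. Network access through the routing nodes is omitted entirely.

```python
import hashlib
import heapq
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(order=True)
class DomainEntry:
    # heapq is a min-heap, so the timestamp is negated to pop the newest first.
    sort_key: float
    domain: str = field(compare=False)


class DomainScheduler:
    """Hypothetical scheduler: hands out the most recently registered domain first."""

    def __init__(self):
        self._heap = []
        self._seen = {}     # domain -> last known registration timestamp
        self._pending = {}  # domain -> timestamp still waiting for a crawler

    def register(self, domain, registered_at):
        """Queue a domain if it is new or its registration time has changed."""
        ts = registered_at.timestamp()
        if self._seen.get(domain) == ts:
            return False  # unchanged, nothing to schedule
        self._seen[domain] = ts
        self._pending[domain] = ts
        heapq.heappush(self._heap, DomainEntry(-ts, domain))
        return True

    def next_domain(self):
        """Return the newest pending domain, or None when the queue is empty."""
        while self._heap:
            entry = heapq.heappop(self._heap)
            # Skip stale heap entries left behind by a later re-registration.
            if self._pending.get(entry.domain) == -entry.sort_key:
                del self._pending[entry.domain]
                return entry.domain
        return None


def standardize(domain, raw_text, fetched_at):
    """Put collected text into one assumed record format and attach metadata."""
    body = raw_text.strip()
    return {
        "domain": domain,
        "content": body,
        "metadata": {
            "fetched_at": fetched_at.astimezone(timezone.utc).isoformat(),
            "sha256": hashlib.sha256(body.encode("utf-8")).hexdigest(),
            "length": len(body),
        },
    }


if __name__ == "__main__":
    scheduler = DomainScheduler()
    scheduler.register("old.example", datetime(2019, 2, 19, tzinfo=timezone.utc))
    scheduler.register("new.example", datetime(2020, 1, 30, tzinfo=timezone.utc))
    target = scheduler.next_domain()  # the later registration is allocated first
    record = standardize(target, "  <html>hi</html>\n", datetime.now(timezone.utc))
    print(record["domain"], record["metadata"]["length"])
```

The heap keeps allocation at logarithmic cost per domain, and the pending map lets a changed domain be re-queued without scanning the heap for its older entry.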