US 11,868,311 B2
Efficient similarity detection
Gabrielle Burns, Santa Clara, CA (US); and Yuping He, San Jose, CA (US)
Assigned to AWES.ME, INC., Mountain View, CA (US)
Filed by AWES.ME, INC., Mountain View, CA (US)
Filed on Sep. 28, 2020, as Appl. No. 17/035,195.
Application 17/035,195 is a continuation of application No. 15/876,652, filed on Jan. 22, 2018, granted, now Pat. No. 10,803,013, issued on Oct. 13, 2020.
Claims priority of provisional application 62/539,963, filed on Aug. 1, 2017.
Claims priority of provisional application 62/457,724, filed on Feb. 10, 2017.
Prior Publication US 2021/0011883 A1, Jan. 14, 2021
Int. Cl. G06F 16/10 (2019.01); G06F 16/13 (2019.01); H04L 67/1097 (2022.01); H04L 67/06 (2022.01); G06F 16/14 (2019.01); G06F 16/16 (2019.01); G06F 16/182 (2019.01); H04L 67/10 (2022.01); G06F 16/28 (2019.01); G06F 16/93 (2019.01); H04L 67/01 (2022.01)
CPC G06F 16/137 (2019.01) [G06F 16/152 (2019.01); G06F 16/168 (2019.01); G06F 16/183 (2019.01); G06F 16/285 (2019.01); G06F 16/93 (2019.01); H04L 67/01 (2022.05); H04L 67/06 (2013.01); H04L 67/10 (2013.01); H04L 67/1097 (2013.01)]
18 Claims
 
1. A method for detecting similar files, the method comprising:
receiving, at a processor, a request from a user device to upload a file to a server;
extracting file information comprising at least a filename, a file size, and metadata from the file with an upload client, wherein the metadata includes information regarding creation of the file separate from content of the file;
generating, by the server, a file signature for the file based on at least the filename, the file size, and the metadata, wherein the file signature is different from a hash signature;
accessing one or more existing file signatures for each of one or more existing files stored on the server;
comparing the one or more existing file signatures to the file signature;
accessing a first hash signature for the existing file corresponding to the existing file signature;
upon determining that the file signature is within a predetermined deviation from one of the existing file signatures based on the comparison, generating a second hash signature for the file corresponding to the file signature; and
storing the file to the server responsive to determining that the first hash signature does not equal the second hash signature.
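
The flow recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation. The names FileInfo, Server, make_signature, and is_within_deviation are hypothetical, as are the tuple form of the file signature, the 1024-byte size tolerance standing in for the "predetermined deviation", and the choice of SHA-256 as the hash signature. The sketch also assumes the hash of a stored file is produced when it is stored, which the claim does not specify.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass(frozen=True)
class FileInfo:
    """File information as extracted by a (hypothetical) upload client."""
    filename: str
    size: int
    created: str  # creation metadata, kept separate from file content


def make_signature(info: FileInfo) -> tuple:
    # Assumed form of the file signature: a plain tuple of the extracted
    # fields. It is deliberately not a hash of the file content.
    return (info.filename, info.size, info.created)


@dataclass
class Server:
    # Each entry pairs an existing file signature with that file's hash.
    stored: list = field(default_factory=list)
    max_size_deviation: int = 1024  # assumed tolerance, in bytes

    def is_within_deviation(self, a: tuple, b: tuple) -> bool:
        # Assumed similarity rule: same name and creation metadata,
        # with sizes differing by at most the tolerance.
        return (a[0] == b[0] and a[2] == b[2]
                and abs(a[1] - b[1]) <= self.max_size_deviation)

    def upload(self, info: FileInfo, content: bytes) -> bool:
        """Return True if the file was stored, False if it was a duplicate."""
        signature = make_signature(info)
        new_hash = None
        for existing_signature, existing_hash in self.stored:
            if self.is_within_deviation(signature, existing_signature):
                # Hash the incoming file only once a similar signature exists.
                if new_hash is None:
                    new_hash = hashlib.sha256(content).hexdigest()
                if new_hash == existing_hash:
                    return False  # hashes are equal, so nothing is stored
        # Assumption: hash on store so later uploads have a hash to compare.
        if new_hash is None:
            new_hash = hashlib.sha256(content).hexdigest()
        self.stored.append((signature, new_hash))
        return True
```

In this sketch, an upload whose signature matches no existing signature is stored without any hash comparison, while an upload with a similar signature is stored only when its hash differs from that of each similar existing file.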