US 12,314,274 B2
Bulletin board data mapping and presentation
Greg Bolcer, Yorba Linda, CA (US); John Petrocik, Irvine, CA (US); Alan Chaney, Simi Valley, CA (US); Nirmisha Bollampalli, Irvine, CA (US); Andrey Mogilev, Novosibirsk (RU); and Kevin Watters, Boston, MA (US)
Assigned to Bitvore Corp., Los Angeles, CA (US)
Filed by Bitvore Corp., Los Angeles, CA (US)
Filed on May 10, 2023, as Appl. No. 18/195,658.
Application 18/195,658 is a continuation of application No. 17/315,626, filed on May 10, 2021, granted, now 11,698,909.
Application 17/315,626 is a continuation of application No. 16/573,320, filed on Sep. 17, 2019, granted, now 11,048,710, issued on Jun. 29, 2021.
Application 16/573,320 is a continuation of application No. 14/855,290, filed on Sep. 15, 2015, granted, now 10,423,628, issued on Sep. 24, 2019.
Application 14/855,290 is a continuation in part of application No. 14/678,762, filed on Apr. 3, 2015, granted, now 11,599,589, issued on Mar. 7, 2023.
Application 14/678,762 is a continuation of application No. 13/214,053, filed on Aug. 19, 2011, granted, now 9,015,244, issued on Apr. 21, 2015.
Claims priority of provisional application 61/375,414, filed on Aug. 20, 2010.
Prior Publication US 2023/0273929 A1, Aug. 31, 2023
This patent is subject to a terminal disclaimer.
Int. Cl. G06F 16/00 (2019.01); G06F 16/2457 (2019.01); G06F 16/248 (2019.01); G06F 16/9535 (2019.01); G06Q 10/101 (2023.01); G06Q 10/107 (2023.01); G06Q 50/00 (2012.01); G06V 30/416 (2022.01); H04L 51/216 (2022.01)

CPC G06F 16/24578 (2019.01) [G06F 16/248 (2019.01); G06F 16/9535 (2019.01); G06Q 10/101 (2013.01); G06Q 10/107 (2013.01); G06Q 50/01 (2013.01); G06V 30/416 (2022.01); H04L 51/216 (2022.05)]

10 Claims

1. A computer system for file analysis, the system comprising:

a memory comprising instructions executable by one or more processors, wherein the one or more processors are operable to execute the instructions to:

control a hardware engine, comprising an entity extractor and a similarity engine, operatively coupled via a graphics bus to accelerate identification and analysis of large datasets in real-time, wherein the hardware engine is configured to:

identify one or more relevant files according to one or more semantic concepts;

identify one or more words for each of the one or more relevant files;

identify one or more n-grams according to the one or more words identified in the one or more relevant files, wherein an n-gram is one or more combinations of the one or more words;

generate a plurality of first scores, wherein each first score of the plurality of first scores is generated according to a term frequency and a global document frequency for each of the one or more words of each of the one or more n-grams of each of the one or more relevant files;

perform vector analysis on the one or more relevant files to generate a model document that improves file classification accuracy and reduces computational complexity by efficiently identifying unknown files according to similarities to relevant files, in order to assign each unknown file a relevant score;

generate a document vector according to averages of the plurality of first scores, wherein the document vector represents a reduced-dimensional representation of the file that increases the speed and accuracy of comparison between files and comprises a final value that illustrates how valuable the one or more words are in the one or more unknown files;

compare the unknown file with the model document according to the term frequency and the global document frequency; and

assign the relevant score to the unknown file according to the comparison, wherein the relevant score is used in a practical application comprising one or more of a technical space, a conceptual field, a geographic location and an industry sector, thereby enhancing the speed and efficiency of file retrieval and classification in such environments.
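
The scoring steps recited in claim 1 (words, n-grams, term frequency and document frequency scores, an averaged document vector, and comparison against a model document) can be illustrated with a minimal Python sketch. This is one possible reading and not the patented implementation. The claim does not specify a tokenizer, an n-gram length, a weighting formula, or a similarity measure, so the whitespace tokenizer, `max_n=2`, the smoothed TF-IDF weighting, the cosine similarity, and every function name below are assumptions chosen for illustration. The hardware engine and graphics bus are not modeled.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenizer (assumption; the claim names no tokenizer)."""
    return text.lower().split()


def ngrams(tokens, max_n=2):
    """All contiguous word combinations of length 1..max_n (max_n is an assumption)."""
    return [tuple(tokens[i:i + n])
            for n in range(1, max_n + 1)
            for i in range(len(tokens) - n + 1)]


def document_frequency(token_lists):
    """Number of relevant files in which each word appears."""
    df = Counter()
    for tokens in token_lists:
        df.update(set(tokens))
    return df


def document_vector(tokens, df, n_docs, max_n=2):
    """Map each n-gram to the average TF-IDF score of its words."""
    if not tokens:
        return {}
    counts = Counter(tokens)

    def word_score(word):
        tf = counts[word] / len(tokens)
        # Smoothed inverse document frequency (assumed weighting).
        idf = math.log((1 + n_docs) / (1 + df[word])) + 1.0
        return tf * idf

    return {gram: sum(word_score(w) for w in gram) / len(gram)
            for gram in set(ngrams(tokens, max_n))}


def model_document(vectors):
    """Average the vectors of the relevant files into one model vector."""
    if not vectors:
        return {}
    keys = set().union(*vectors)
    return {k: sum(v.get(k, 0.0) for v in vectors) / len(vectors) for k in keys}


def cosine(a, b):
    """Cosine similarity between two sparse vectors (assumed comparison)."""
    dot = sum(w * b.get(k, 0.0) for k, w in a.items())
    norm = (math.sqrt(sum(w * w for w in a.values()))
            * math.sqrt(sum(w * w for w in b.values())))
    return dot / norm if norm else 0.0


def relevance_score(text, model, df, n_docs):
    """Score an unknown file by its similarity to the model document."""
    return cosine(document_vector(tokenize(text), df, n_docs), model)


# Hypothetical relevant files, used only to exercise the sketch.
relevant = ["city issues municipal bond", "municipal bond rating upgrade"]
token_lists = [tokenize(t) for t in relevant]
df = document_frequency(token_lists)
n_docs = len(token_lists)
model = model_document([document_vector(t, df, n_docs) for t in token_lists])

related = relevance_score("new municipal bond offering", model, df, n_docs)
unrelated = relevance_score("football match tonight", model, df, n_docs)
```

In this sketch an unknown file that shares words and n-grams with the relevant files receives a higher score than one that shares none, which is the ordering the claim's comparison step relies on. Averaging the per-file vectors is just one way to build the model document; a centroid over weighted files or a trained classifier would also fit the claim language.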