US 11,914,657 B2
Machine learning aided automatic taxonomy for web data
Rohit Kewalramani, Pune (IN); and Justin Chien, Campbell, CA (US)
Assigned to 6SENSE INSIGHTS, INC., San Francisco, CA (US)
Filed by 6SENSE INSIGHTS, INC., San Francisco, CA (US)
Filed on May 31, 2022, as Appl. No. 17/828,910.
Claims priority of provisional application 63/195,566, filed on Jun. 1, 2021.
Prior Publication US 2022/0391453 A1, Dec. 8, 2022
Int. Cl. G06F 16/00 (2019.01); G06F 16/951 (2019.01); G06N 3/04 (2023.01); G06N 3/082 (2023.01); G06F 40/205 (2020.01)
CPC G06F 16/951 (2019.01) [G06F 40/205 (2020.01); G06N 3/04 (2013.01); G06N 3/082 (2013.01)]
20 Claims
 
1. A method comprising using at least one hardware processor to:
  during a training mode, train a model to predict a class, from a plurality of classes in a taxonomy of web-based activities, based on a training dataset that comprises a plurality of annotated features, wherein each of the plurality of annotated features comprises one or more features, which have been derived from a uniform resource locator (URL) of an online resource and metadata associated with that online resource, and a ground-truth class assigned to those one or more features; and,
  during an operation mode,
    acquire web data comprising one or more activity records, wherein each of the one or more activity records comprises a URL of an online resource that was accessed by a visitor, and metadata associated with that online resource, and,
    for each of the one or more activity records,
      extract a set of one or more features from the URL and the metadata in the activity record,
      apply the trained model to the set of one or more features to predict a class, from the plurality of classes in the taxonomy, that is associated with the set of one or more features, and
      store the predicted class in association with the URL in the activity record.
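
The two modes recited in claim 1 can be illustrated with a minimal sketch. This is not the patented implementation: the claim does not name a model type or a feature scheme, so the sketch assumes word tokens taken from the URL path, the URL query, and a metadata title, and it swaps in a small hand-written multinomial naive Bayes classifier for whatever model is actually used. The names `extract_features`, `NaiveBayesTaxonomyModel`, the class labels, and the example URLs are all hypothetical.

```python
import math
import re
from collections import Counter, defaultdict
from urllib.parse import urlparse


def extract_features(url, metadata):
    """Derive word-token features from a URL and its metadata (assumed scheme)."""
    parsed = urlparse(url)
    text = " ".join([parsed.path, parsed.query, metadata.get("title", "")])
    return re.findall(r"[a-z]+", text.lower())


class NaiveBayesTaxonomyModel:
    """Multinomial naive Bayes with add-one smoothing (a stand-in model)."""

    def __init__(self):
        self.class_counts = Counter()             # training examples per class
        self.token_counts = defaultdict(Counter)  # token frequencies per class
        self.vocab = set()

    def train(self, annotated_features):
        # Each item pairs a feature list with its ground-truth class.
        for tokens, label in annotated_features:
            self.class_counts[label] += 1
            self.token_counts[label].update(tokens)
            self.vocab.update(tokens)

    def predict(self, tokens):
        total = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, count in self.class_counts.items():
            score = math.log(count / total)
            denom = sum(self.token_counts[label].values()) + len(self.vocab)
            for tok in tokens:
                score += math.log((self.token_counts[label][tok] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Training mode: annotated features built from (URL, metadata, ground-truth class).
training = [
    ("https://example.com/pricing/plans", {"title": "Pricing and plans"}, "pricing"),
    ("https://example.com/blog/ml-taxonomy", {"title": "Blog: machine learning"}, "blog"),
    ("https://example.com/careers/openings", {"title": "Careers and open jobs"}, "careers"),
]
model = NaiveBayesTaxonomyModel()
model.train([(extract_features(u, m), c) for u, m, c in training])

# Operation mode: extract features, apply the model, store the class with the URL.
records = [
    {"url": "https://example.org/pricing/enterprise",
     "metadata": {"title": "Enterprise pricing"}},
]
predicted = {}
for rec in records:
    features = extract_features(rec["url"], rec["metadata"])
    predicted[rec["url"]] = model.predict(features)
```

Here the dictionary `predicted` plays the role of the store that keeps each predicted class in association with its URL; a real system would presumably write to a database or back into the activity record instead.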