| CPC G06Q 30/0204 (2013.01) [G06N 20/20 (2019.01); G06Q 10/06315 (2013.01); G06Q 10/06393 (2013.01)] | 22 Claims |

|
1. A method for segmenting a large dataset into distinct segments using artificial intelligence (AI), the method performed by at least one processor comprising hardware, the method comprising:
receiving aggregated datasets comprising user data and user IDs assigned thereto from multiple data sources, the user data comprising demographic data, behavioral data, and transactional data for given users;
processing the datasets by denoising and feature-learning the datasets by a neural network-based denoising autoencoder to reduce dimensionality, enhance quality of the user data, and enhance segmentation accuracy;
extracting user data characteristics from the processed datasets;
creating distinct segments according to a segmentation pipeline based on the extracted user data characteristics;
assigning users membership into given ones of the distinct segments according to an ensemble machine learning-based segmentation model and the extracted user data characteristics, wherein the ensemble machine learning-based segmentation model integrates multiple clustering algorithms, the multiple clustering algorithms comprising k-means clustering, hierarchical clustering, and density-based clustering;
quantifying, by an explainability module using game theory and Shapley values, an importance of each of the extracted user data characteristics in creating the distinct segments and assigning users membership into the distinct segments;
translating, by an explainability module using a large language model, into plain English the importance of each of the user data characteristics in determining segment membership;
storing the segmentation model, distinct segments and explanation in a database;
receiving additional user data;
continuously refining the segmentation model according to the additional user data to improve precision;
updating a set of the distinct segments according to the refined segmentation model;
generating marketing recommendations and key insights for each one of the distinct segments, the key insights comprising comparative and numerical information representative of each one of the distinct segments; and
creating a plurality of detailed personas based on the user data characteristics and key insights, each one of the plurality of detailed personas comprising digital usage data representative of a given distinct segment.
|
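
The quantifying step of claim 1 recites game theory and Shapley values but does not fix a particular computation. As a minimal sketch only, and not the claimed implementation, the following Python computes exact Shapley values for a small set of user data characteristics. The feature names (`age`, `purchase_frequency`, `session_length`) and the scoring function `toy_value` are hypothetical stand-ins for a measure of how well a subset of characteristics separates the segments; in practice such a score would come from the segmentation model itself.

```python
from itertools import combinations
from math import factorial


def shapley_values(features, value):
    """Exact Shapley values for a coalition value function.

    `value` takes a frozenset of feature names and returns a number.
    The cost grows exponentially with len(features), so this is only
    practical for a handful of features.
    """
    n = len(features)
    phi = {}
    for f in features:
        others = [g for g in features if g != f]
        total = 0.0
        for k in range(n):
            # Weight of a coalition of size k that does not contain f.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                # Marginal contribution of f when added to coalition s.
                total += weight * (value(s | {f}) - value(s))
        phi[f] = total
    return phi


def toy_value(coalition):
    """Hypothetical segmentation-quality score (assumption, not from the claim)."""
    score = 0.0
    if "age" in coalition:
        score += 0.2
    if "purchase_frequency" in coalition:
        score += 0.4
    # Interaction term: these two features only help further when combined.
    if "age" in coalition and "session_length" in coalition:
        score += 0.2
    return score


FEATURES = ["age", "purchase_frequency", "session_length"]
importances = shapley_values(FEATURES, toy_value)

# Rank the characteristics by attributed importance, highest first.
for name, phi in sorted(importances.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {phi:.3f}")
```

By the efficiency property of Shapley values, the attributions sum to the score of the full feature set minus the score of the empty set, and the interaction term is split equally between the two features that produce it. Enumerating every coalition is exponential in the number of features, so a real pipeline with many characteristics would likely rely on a sampling or model-specific approximation instead.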