| CPC G06Q 30/0206 (2013.01) [G06Q 30/0203 (2013.01); G06Q 30/0205 (2013.01)] | 23 Claims |

|
1. Methodology for the collection of data required to study the extent to which pricing algorithms in internet markets may induce disparities across demographic consumer groups, without collecting or storing the demographic data of the internet market consumers, comprising:
providing a first internet crawler comprising one or more processors programmed to:
open a targeted retailer's main web page and then recursively crawl through all products in at least one selected category of products,
build a tree of all products available in the at least one selected category, and
collect and store the https addresses of the product pages of all products in the at least one selected category; and
providing a second internet crawler comprising one or more processors programmed to:
receive the collected https addresses of the product pages from the first internet crawler,
collect and store pricing data from a relatively large number of a plurality of locations of the targeted retailer, including from a first of the plurality of locations of the targeted retailer,
close an associated browser,
delete its browsing history and cookies,
recursively perform such collect, store, close, and delete sequencing for each of the remainder of the plurality of locations of the targeted retailer, for the stored pricing data associated with the respective plurality of locations of the targeted retailer to be subsequently analyzed for disparate impact (DI), to detect pricing algorithm bias while protecting consumer privacy, and
collect pricing data across multiple locations, using independent threads which are run asynchronously in parallel, with each thread based on a randomly selected focal zip code from all of the zip codes and its identified neighboring zip codes, so that for each focal zip code a new browser session is created using a browser selected at random; and
wherein the one or more processors comprising the first and second internet crawlers are further programmed to operate a plurality of browsing sessions run in parallel using multi-threading.
|
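The two-crawler sequence recited in claim 1 can be illustrated with a minimal, network-free sketch. It is one possible reading of the claim language, not the patented implementation. Everything named here is a hypothetical stand-in introduced for illustration: the in-memory `SITE` map, the `retailer.example` addresses, the `BROWSERS` pool, the `NEIGHBORS` zip-code table, the `BrowserSession` class, and the fake price function. The sketch also assumes that any page with no child links is a product page, and that the browser is chosen once per focal zip code and reused for its neighboring zip codes.

```python
import random
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for a retailer site: page address -> linked child pages.
ROOT = "https://retailer.example/"
SITE = {
    ROOT: ["https://retailer.example/c/tools"],
    "https://retailer.example/c/tools": [
        "https://retailer.example/c/tools/drills",
        "https://retailer.example/p/hammer-1",
    ],
    "https://retailer.example/c/tools/drills": [
        "https://retailer.example/p/drill-1",
        "https://retailer.example/p/drill-2",
    ],
}
BROWSERS = ["chrome", "firefox", "edge"]  # assumed browser pool
NEIGHBORS = {"10001": ["10011", "10018"], "60601": ["60602"]}  # hypothetical


def crawl(url, site):
    """First crawler: recursively build a tree of pages below `url`."""
    return {"url": url, "children": [crawl(c, site) for c in site.get(url, [])]}


def product_urls(node):
    """Collect product page addresses (assumed here to be the tree's leaves)."""
    if not node["children"]:
        return [node["url"]]
    urls = []
    for child in node["children"]:
        urls.extend(product_urls(child))
    return urls


class BrowserSession:
    """Stand-in for a real browser session; performs no network access."""

    def __init__(self, browser, zip_code):
        self.browser = browser
        self.zip_code = zip_code
        self.cookies = {"store_zip": zip_code}
        self.is_open = True

    def fetch_price(self, url):
        # Deterministic fake price so the sketch can be tested offline.
        return round(10 + (sum(map(ord, url + self.zip_code)) % 500) / 100, 2)

    def close_and_clear(self):
        # Close the browser and delete its cookies before the next location.
        self.is_open = False
        self.cookies.clear()


def collect_for_focal_zip(focal, urls, seed):
    """Second crawler, one thread: focal zip code plus its neighbors."""
    browser = random.Random(seed).choice(BROWSERS)  # browser picked at random
    rows = []
    for zip_code in [focal] + NEIGHBORS[focal]:
        session = BrowserSession(browser, zip_code)
        try:
            for url in urls:
                # Only location, address, and price are stored; no consumer data.
                rows.append((zip_code, url, session.fetch_price(url)))
        finally:
            session.close_and_clear()
    return rows


def run(seed=0):
    urls = product_urls(crawl(ROOT, SITE))
    # Visit the focal zip codes in a random order.
    focal_zips = random.Random(seed).sample(sorted(NEIGHBORS), k=len(NEIGHBORS))
    with ThreadPoolExecutor(max_workers=4) as pool:
        futures = [
            pool.submit(collect_for_focal_zip, z, urls, seed + i)
            for i, z in enumerate(focal_zips)
        ]
        rows = [row for f in futures for row in f.result()]
    return urls, rows
```

Calling `run()` returns the list of product page addresses and a list of `(zip_code, url, price)` rows. In a real deployment the `BrowserSession` stand-in would presumably be replaced by an automated browser driver, and the stored rows would then be passed to a separate disparate impact analysis.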