CPC G06Q 30/0201 (2013.01) [G06F 16/285 (2019.01); G06N 3/044 (2023.01); G06N 3/08 (2013.01); G06N 20/00 (2019.01); G06Q 40/04 (2013.01); G06Q 40/06 (2013.01)] | 23 Claims |
1. A computer system comprising:
a processor;
a tangible computer-readable medium containing computer-executable instructions that when executed by the processor cause the processor to:
receive, from a client computer via an electronic communication network, a data set comprising a plurality of data records each including data indicative of a time stamp, a level, and a quantity, the data set characterized by a first size;
determine, for each time stamp of each data record of the data set, a difference in the quantity at each level when compared to the quantity of the data record comprising data indicative of the same level at a prior time stamp;
arrange the data set into a sequence of time period windows of a selected adjustable length sufficient to encompass one of a pattern or structure within the data set;
determine quantiles for changes in the quantities;
divide the determined differences into predefined portions, each of which is characterized by one of a plurality of categories, each category being assigned to the time period window in accordance with the predefined portions and the determined quantiles;
generate a new pre-processed data set comprising the sequence of time period windows, wherein each data record of the new pre-processed data set includes a vector encoding of the plurality of categories representative of each price level and time therein, the new pre-processed data set characterized by a second size less than the first size; and
transmit the new pre-processed data set as input to a computer system, wherein, upon receipt of the new pre-processed data set, the computer system executes a machine learning algorithm, wherein the execution of the machine learning algorithm includes training a recurrent neural network to identify the structure in the new pre-processed data set and executing a lossy encoded compression to compress the sequence of time period windows to provide a feature mapping from the sequence of time period windows to a feature space, wherein the lossy encoded compression of the sequence removes noise from the sequence of time period windows while retaining unique features of the feature space.
|
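Illustrative sketch (not part of the claim): the pre-processing steps recited in claim 1 (per-level quantity differences, quantile-based categories, vector encoding, and time period windows) can be pictured roughly as follows. Everything named here is an assumption made for illustration: the function names `quantity_differences` and `preprocess`, the `(time_stamp, level, quantity)` tuple layout, the one-hot form of the vector encoding, the all-zero vector for a level with no prior record, the choice to compute the quantile cut points once over all differences and categorize each difference per time stamp, and the non-overlapping windows. The claim itself does not fix any of these choices, and the recurrent neural network and lossy compression steps are not shown.

```python
from bisect import bisect_right
from collections import defaultdict
from statistics import quantiles


def quantity_differences(records):
    """Return {time_stamp: {level: change in quantity since that level's prior record}}.

    `records` is an iterable of (time_stamp, level, quantity) tuples
    (a hypothetical layout). The first record seen for a level has no
    prior record, so it yields no difference.
    """
    last_quantity = {}
    diffs = defaultdict(dict)
    for time_stamp, level, quantity in sorted(records):
        if level in last_quantity:
            diffs[time_stamp][level] = quantity - last_quantity[level]
        last_quantity[level] = quantity
    return dict(diffs)


def preprocess(records, n_categories=4, window_len=2):
    """Turn raw records into windows of one-hot category vectors (assumed encoding)."""
    diffs = quantity_differences(records)

    # Quantile cut points over all observed changes: n_categories - 1 values.
    all_changes = [d for per_level in diffs.values() for d in per_level.values()]
    cuts = quantiles(all_changes, n=n_categories)

    levels = sorted({lvl for per_level in diffs.values() for lvl in per_level})

    rows = []
    for time_stamp in sorted(diffs):
        row = []
        for lvl in levels:
            one_hot = [0] * n_categories
            if lvl in diffs[time_stamp]:
                # The category is the index of the quantile bin the change falls into.
                one_hot[bisect_right(cuts, diffs[time_stamp][lvl])] = 1
            row.extend(one_hot)  # an all-zero block marks "no difference available"
        rows.append(row)

    # Non-overlapping windows of `window_len` time steps; a trailing
    # partial window is dropped in this sketch.
    return [rows[i:i + window_len]
            for i in range(0, len(rows) - window_len + 1, window_len)]


# Hypothetical sample: five time stamps, two levels (100 and 101).
sample = [
    (0, 100, 10), (0, 101, 5),
    (1, 100, 12), (1, 101, 5),
    (2, 100, 9),  (2, 101, 8),
    (3, 100, 9),  (3, 101, 2),
    (4, 100, 15), (4, 101, 3),
]
windows = preprocess(sample, n_categories=4, window_len=2)
```

In this sketch each window is a list of `window_len` rows, and each row concatenates one category vector per level. A real system would choose the window length to match the pattern or structure of interest, as the claim recites, rather than the fixed value assumed here.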