US 12,443,798 B2
Methods and systems for preparing unstructured data for statistical analysis using electronic characters
Forrestt Severtson, Alpharetta, GA (US)
Assigned to STATE FARM MUTUAL AUTOMOBILE INSURANCE COMPANY, Bloomington, IL (US)
Filed by STATE FARM MUTUAL AUTOMOBILE INSURANCE COMPANY, Bloomington, IL (US)
Filed on Jun. 21, 2022, as Appl. No. 17/845,689.
Claims priority of provisional application 63/214,097, filed on Jun. 23, 2021.
Prior Publication US 2023/0023636 A1, Jan. 26, 2023
Int. Cl. G06F 40/295 (2020.01); G06N 5/022 (2023.01)
CPC G06F 40/295 (2020.01) [G06N 5/022 (2013.01)]
20 Claims
1. A method for preparing unstructured data for machine learning analysis, the method comprising:
receiving, by one or more processors, data representing a plurality of processes;
analyzing, by the one or more processors, the data to identify, for each process of the plurality of processes, a time-ordered sequence of events that occurred during the process;
generating, by the one or more processors, a plurality of emoji sequences by, for each process of the plurality of processes, generating an emoji sequence, each emoji in the emoji sequence representing an event of the events that occurred during the process, and the emoji sequence ordered in accordance with the time-ordered sequence;
generating, by the one or more processors, graphical representations of the plurality of emoji sequences, the graphical representations including information about the order in which emojis occur in each emoji sequence;
generating, by the one or more processors, a plurality of feature vectors corresponding to the respective plurality of emoji sequences, the plurality of feature vectors including information based on the graphical representations; and
applying, by the one or more processors, a machine learning technique to the plurality of feature vectors.
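
The steps recited in claim 1 can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not the patented implementation. The event names, the event-to-emoji mapping, and the sample data are hypothetical. The "graphical representation" is assumed here to be a directed transition graph stored as edge counts, and a cosine-similarity nearest-neighbour lookup stands in for the unspecified machine learning technique.

```python
from collections import Counter
from math import sqrt

# Hypothetical event-to-emoji mapping; the claim does not specify one.
EVENT_TO_EMOJI = {
    "claim_filed": "📝",
    "phone_call": "📞",
    "inspection": "🔍",
    "payment": "💰",
}

def to_emoji_sequence(events):
    """Order (timestamp, event_name) pairs by time, then map each event to an emoji."""
    ordered = sorted(events, key=lambda event: event[0])
    return [EVENT_TO_EMOJI[name] for _, name in ordered]

def transition_graph(sequence):
    """Assumed graph form: counts of directed edges (emoji, next_emoji)."""
    return Counter(zip(sequence, sequence[1:]))

def feature_vectors(graphs):
    """Build one vector per graph: edge counts over a shared, sorted edge vocabulary."""
    edges = sorted({edge for graph in graphs for edge in graph})
    return edges, [[graph[edge] for edge in edges] for graph in graphs]

def cosine(u, v):
    """Cosine similarity of two equal-length count vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def nearest(index, vectors):
    """Stand-in ML step: index of the other vector most similar to vectors[index]."""
    others = [i for i in range(len(vectors)) if i != index]
    return max(others, key=lambda i: cosine(vectors[index], vectors[i]))

# Hypothetical input: three processes, each a list of (timestamp, event_name) pairs.
processes = [
    [(3, "payment"), (1, "claim_filed"), (2, "inspection")],
    [(1, "claim_filed"), (2, "phone_call"), (3, "inspection"), (4, "payment")],
    [(1, "claim_filed"), (2, "inspection"), (3, "payment")],
]

sequences = [to_emoji_sequence(events) for events in processes]
graphs = [transition_graph(sequence) for sequence in sequences]
edges, vectors = feature_vectors(graphs)
closest_to_first = nearest(0, vectors)

print(["".join(sequence) for sequence in sequences])
print(closest_to_first)
```

Encoding each graph as edge counts keeps the ordering information the claim requires, since two sequences with the same emojis in a different order produce different edges and therefore different feature vectors.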