CPC G06F 40/58 (2020.01) [G06N 5/025 (2013.01)] | 24 Claims |
1. A system, comprising:
one or more processors; and
one or more non-transitory computer readable media to store instructions executable by the one or more processors to perform operations comprising:
parsing, using an address parser, a string comprising a set of substrings;
classifying, by a machine learning model comprising a named entity recognition model, the set of substrings into:
a set of address substrings; and
a set of non-address substrings;
mapping a non-address substring from the set of non-address substrings to a non-address component, wherein the non-address substring is excluded from a standardized address based on a jurisdiction-specific template; and
producing, as output, an address component classification for individual substrings in the set of substrings;
wherein the machine learning model further comprises a convolutional neural network configured to encode context-independent vectors into a context-sensitive sentence matrix starting with at least 128 dimensions for individual words in the set of substrings.
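As an illustration of the final limitation only, the following minimal sketch shows one way a one-dimensional convolution could turn context-independent word vectors into a context-sensitive sentence matrix with 128 values per word. The claim does not specify the embedding size, kernel width, activation, or weights; `EMBED_DIM`, `WINDOW`, `embed`, `encode`, the `tanh` activation, and the untrained random filters are all assumptions made for this example.

```python
import math
import random

EMBED_DIM = 16   # hypothetical size of each context-independent word vector
OUT_DIM = 128    # the claim recites at least 128 dimensions per word
WINDOW = 3       # hypothetical kernel width: one neighbour on each side

# Untrained random filters (assumption): OUT_DIM filters, each spanning
# WINDOW consecutive word vectors.
_rng = random.Random(0)
FILTERS = [
    [_rng.uniform(-0.1, 0.1) for _ in range(WINDOW * EMBED_DIM)]
    for _ in range(OUT_DIM)
]


def embed(word):
    """Context-independent vector: depends only on the word itself."""
    r = random.Random(word.lower())
    return [r.uniform(-1.0, 1.0) for _ in range(EMBED_DIM)]


def encode(words):
    """Return one OUT_DIM-wide row per word, computed from the word and
    its neighbours, so each row is sensitive to context."""
    vectors = [embed(w) for w in words]
    pad = [0.0] * EMBED_DIM
    half = WINDOW // 2
    padded = [pad] * half + vectors + [pad] * half
    matrix = []
    for i in range(len(words)):
        # Flatten the window of WINDOW word vectors centred on word i.
        window = [x for vec in padded[i:i + WINDOW] for x in vec]
        row = [
            math.tanh(sum(w * x for w, x in zip(f, window)))
            for f in FILTERS
        ]
        matrix.append(row)
    return matrix


sentence_matrix = encode(["742", "Evergreen", "Terrace"])
print(len(sentence_matrix), len(sentence_matrix[0]))  # 3 128
```

In this sketch the same word always receives the same input vector from `embed`, while its row in the output of `encode` changes when a neighbouring word changes, which is the sense in which the matrix is context-sensitive.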
|
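The operations recited in claim 1 (parsing, classifying into address and non-address substrings, mapping non-address substrings to non-address components, excluding them from the standardized address, and outputting a classification per substring) can be sketched as follows. This is a minimal illustration under stated assumptions, not the claimed implementation: the comma-splitting `parse`, the rule-based `classify` standing in for the named entity recognition model, the label names, and the `TEMPLATES` table are all hypothetical.

```python
import re
from dataclasses import dataclass

# Hypothetical jurisdiction-specific template: the ordered address
# components that may appear in a standardized address.
TEMPLATES = {"US": ["street_line", "city", "region_postcode"]}

# Hypothetical non-address components.
NON_ADDRESS_LABELS = {"recipient", "delivery_note"}


@dataclass
class Classified:
    text: str
    label: str
    is_address: bool


def parse(string):
    """Stand-in for the address parser: split the string on commas."""
    return [s.strip() for s in string.split(",") if s.strip()]


def classify(substring):
    """Rule-based stand-in for the NER model (assumption, not a trained model)."""
    if re.match(r"attn\b", substring, re.IGNORECASE):
        label = "recipient"
    elif re.match(r"leave\b", substring, re.IGNORECASE):
        label = "delivery_note"
    elif re.match(r"\d+\s+\S", substring):
        label = "street_line"
    elif re.fullmatch(r"[A-Z]{2}\s+\d{5}", substring):
        label = "region_postcode"
    else:
        label = "city"
    return Classified(substring, label, label not in NON_ADDRESS_LABELS)


def process(string, jurisdiction):
    classified = [classify(s) for s in parse(string)]
    # Map each non-address substring to its non-address component.
    non_address = {c.text: c.label for c in classified if not c.is_address}
    # Build the standardized address from the template; non-address
    # substrings are excluded because the template has no slot for them.
    by_label = {c.label: c.text for c in classified if c.is_address}
    standardized = ", ".join(
        by_label[label] for label in TEMPLATES[jurisdiction] if label in by_label
    )
    return classified, non_address, standardized


classified, non_address, standardized = process(
    "Attn: Jane Doe, 742 Evergreen Terrace, Springfield, OR 97477, leave at back door",
    "US",
)
print(standardized)  # 742 Evergreen Terrace, Springfield, OR 97477
```

Here `classified` plays the role of the address component classification output for individual substrings, and `non_address` records which non-address component each excluded substring was mapped to.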