| CPC G06F 16/24578 (2019.01) [G06F 16/2468 (2019.01); G06F 16/29 (2019.01); G06N 3/08 (2013.01); G06N 3/105 (2013.01)] | 20 Claims |

|
1. A method comprising:
receiving an address string;
generating parsed address data based on the address string;
wherein the parsed address data comprises a plurality of subfield values, said plurality of subfield values including at least one street-level subfield value and at least one non-street-level subfield value;
using a global toponym database, identifying a plurality of field candidates for a subset of the plurality of subfield values of the parsed address data;
generating a plurality of address tuples based on the plurality of field candidates;
wherein the plurality of address tuples comprises combinations of field candidates of the plurality of field candidates;
generating a plurality of queries for the global toponym database based on the plurality of address tuples;
providing, to the global toponym database, at least one first query of the plurality of queries corresponding to at least one first address tuple of the plurality of address tuples;
in response to the global toponym database failing to identify a canonical street locale for the at least one first address tuple, providing, to the global toponym database, a second query of the plurality of queries corresponding to a second address tuple of the plurality of address tuples;
receiving, from the global toponym database, one or more standardized street locales, each of said one or more standardized street locales being an actual non-street level address associated with the second address tuple;
generating a query for a global address database to: exactly match one or more non-street-level subfield values included in a particular standardized street locale of said one or more standardized street locales and fuzzy match a street-level value of said at least one street-level subfield value, the fuzzy match excluding said one or more non-street-level subfield values;
receiving, from the global address database, a set of candidate addresses that are identical to one of said one or more standardized street locales on a non-street level;
generating a score for each candidate address of the set of candidate addresses, each score of said each candidate address comprising a probability that said each candidate address represents a same place or location as a combination of said at least one street level subfield and the respective standardized street locale of said each candidate address.
|
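
The flow recited in claim 1 can be illustrated with a minimal Python sketch. It is not the claimed implementation: the two dictionaries below are hypothetical in-memory stand-ins for the global toponym database and the global address database, the function names are invented for illustration, and the score uses the `difflib.SequenceMatcher` similarity ratio as a simple substitute for the probability the claim recites. The sketch shows three steps: building address tuples as combinations of field candidates, falling back to the next tuple when a query fails to return a locale, and then matching non-street-level fields exactly while matching the street value fuzzily.

```python
from difflib import SequenceMatcher
from itertools import product

# Hypothetical stand-in for the global toponym database:
# (city, region) -> standardized non-street-level locale.
TOPONYMS = {
    ("springfield", "il"): {"city": "Springfield", "region": "IL"},
}

# Hypothetical stand-in for the global address database.
ADDRESSES = [
    {"street": "Main Street", "city": "Springfield", "region": "IL"},
    {"street": "Maine Avenue", "city": "Springfield", "region": "IL"},
    {"street": "Main Street", "city": "Springfield", "region": "MA"},
]


def address_tuples(field_candidates):
    """Yield every combination of field candidates, one value per field."""
    names = list(field_candidates)
    for combo in product(*(field_candidates[name] for name in names)):
        yield dict(zip(names, combo))


def resolve_locale(field_candidates):
    """Query tuple by tuple, moving to the next tuple when one fails."""
    for tup in address_tuples(field_candidates):
        key = (tup["city"].lower(), tup["region"].lower())
        locale = TOPONYMS.get(key)
        if locale is not None:
            return locale
    return None


def score_candidates(street, locale):
    """Exact match on non-street-level fields, fuzzy score on the street only."""
    scored = []
    for addr in ADDRESSES:
        # The non-street-level fields must be identical to the locale.
        if all(addr[field] == value for field, value in locale.items()):
            # Similarity ratio in [0, 1]; a stand-in, not a true probability.
            ratio = SequenceMatcher(
                None, street.lower(), addr["street"].lower()
            ).ratio()
            scored.append((addr, ratio))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# The misspelled city candidate fails first, so the second tuple is tried.
candidates = {"city": ["Springfeld", "Springfield"], "region": ["IL"]}
locale = resolve_locale(candidates)
results = score_candidates("Main St", locale)
```

In this sketch the Massachusetts record is never scored, because it differs from the standardized locale on a non-street-level field, and the remaining candidates are ranked only by how closely their street value resembles the input street.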