| CPC G06F 16/29 (2019.01) [G06F 18/214 (2023.01); G06N 20/00 (2019.01); G06Q 30/0629 (2013.01); G06Q 30/0631 (2013.01); G06Q 40/03 (2023.01); G06Q 50/16 (2013.01); G06Q 30/0207 (2013.01)] | 21 Claims |

|
1. A system comprising:
processing circuitry;
a non-transitory storage region comprising
a plurality of machine learning models, each machine learning model trained at least in part using
physical property data corresponding to a respective subset of a population of properties, and
geospatial data corresponding to geographic locations of the respective subset of the population of properties, wherein
a different respective portion of the plurality of machine learning models is configured to analyze property similarities for each respective geographic region of a plurality of geographic regions,
wherein each machine learning model is configured to produce a set of features statistically significant to property comparison in a respective geographic region of the plurality of geographic regions; and
a non-transitory computer readable memory coupled to the processing circuitry, the memory storing machine-executable instructions, wherein the machine-executable instructions, when executed on the processing circuitry, cause the processing circuitry to
generate, using the plurality of machine learning models, a plurality of feature sets, each respective feature set comprising a plurality of features determined by a respective machine learning model of the plurality of machine learning models to be statistically significant in performing comparison scoring between properties in the respective geographic region corresponding to the respective machine learning model, wherein
the plurality of features of each respective feature set comprises
a first subset of features corresponding to one or more physical property conditions of a plurality of physical property conditions, and
a second subset of features corresponding to one or more geographic location conditions of a plurality of geographic location conditions, and
generating each respective feature set comprises determining, for each respective feature of the plurality of features of the respective feature set, a respective weighting value indicating a relative importance of the respective feature to determining property similarity in the respective geographic region corresponding to the machine learning model that generated the respective feature set,
identify, based at least in part on a given geographic region of the plurality of geographic regions, a selected feature set of the plurality of feature sets applicable to a plurality of properties located in the given geographic region,
generate, from attribute data received from one or more data sources for the plurality of properties, a plurality of missing values corresponding to each of at least a subset of the plurality of features of the selected feature set for each respective property of at least a portion of the plurality of properties, wherein
the plurality of missing values are unknown data values among feature data of the plurality of properties, and
generating the missing values includes
for each respective property of at least a portion of the plurality of properties, calculating, from one or more geospatial data sets of the attribute data, at least one of a) a distance from the respective property to each point of interest of one or more points of interest, or b) a location density of a respective type of a plurality of types of each point of interest of the one or more points of interest, and
deriving, for each respective property of one or more properties of the plurality of properties, one or more values of one or more features of the plurality of features from amplifying information associated with the respective property, wherein
the one or more features are missing from variable values of the attribute data,
the amplifying information comprises at least one of image data representing the respective property or text data comprising a description of the respective property, and
deriving the one or more values comprises, for at least one property of the one or more properties, analyzing the image data and/or the text data to impute at least one feature of the one or more features,
responsive to receiving a comparable property query from a remote computing device of a user via a network, the comparable property query being related to the given geographic region, calculate, for each respective property of the plurality of properties from i) each respective feature data value of the respective property corresponding to each respective feature of the selected feature set and ii) the weighting values of the features of the selected feature set, a respective one or more similarity scores each identifying an amount of correspondence between the respective property and a queried property identified in the comparable property query, and
output, to the remote computing device of the user, a plurality of comparable property recommendations for the queried property, wherein
the plurality of comparable property recommendations represent a subset of the plurality of properties, and
the plurality of comparable property recommendations is ranked according to the respective one or more similarity scores.
|
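The computational steps recited in claim 1 (deriving point-of-interest distance and density features, scoring each property against the queried property using per-feature weighting values, and ranking the results) can be illustrated with a minimal sketch. This is not the patented implementation: the function names, the dictionary layout, the `scales` normalisation, the 1 km density radius, and the clipped weighted-difference similarity formula are all assumptions chosen for illustration. In the claim, the weighting values come from a region-specific machine learning model; here they are simply passed in.

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points."""
    r = 6371.0  # mean Earth radius in km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def poi_features(prop, pois, radius_km=1.0):
    """Hypothetical geospatial features for one property:
    (distance to the nearest point of interest, count of POIs within radius_km)."""
    dists = [haversine_km(prop["lat"], prop["lon"], p["lat"], p["lon"]) for p in pois]
    return min(dists, default=math.inf), sum(d <= radius_km for d in dists)

def similarity(query, candidate, weights, scales):
    """Assumed scoring rule: 1 minus the weighted, scale-normalised absolute
    difference per feature, with each difference clipped to 1. Result is in [0, 1]."""
    total = sum(weights.values())
    penalty = 0.0
    for name, w in weights.items():
        diff = abs(query[name] - candidate[name]) / scales[name]
        penalty += w * min(diff, 1.0)
    return 1.0 - penalty / total

def rank_comparables(query, candidates, weights, scales, top_k=3):
    """Score every candidate against the queried property and return the
    top_k (score, id) pairs, highest similarity first."""
    scored = [(similarity(query, c, weights, scales), c["id"]) for c in candidates]
    scored.sort(key=lambda t: t[0], reverse=True)
    return scored[:top_k]

# Illustrative data only; the weights stand in for model-derived weighting values.
weights = {"sqft": 0.6, "dist_school_km": 0.4}
scales = {"sqft": 1000.0, "dist_school_km": 5.0}
query = {"id": "q", "sqft": 1500, "dist_school_km": 1.0}
candidates = [
    {"id": "a", "sqft": 1500, "dist_school_km": 1.0},
    {"id": "b", "sqft": 2000, "dist_school_km": 1.0},
    {"id": "c", "sqft": 4000, "dist_school_km": 6.0},
]
top = rank_comparables(query, candidates, weights, scales, top_k=2)
```

With these inputs, candidate `a` matches the queried property on both features and ranks first, followed by `b`, which differs only in floor area. A different geographic region would supply a different `weights` mapping, which is how the region-specific feature sets of the claim would change the ranking under this sketch.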