| CPC G06V 10/774 (2022.01) [G06V 10/764 (2022.01); G06V 10/776 (2022.01); G06V 20/10 (2022.01)] | 20 Claims |

|
1. A computer-implemented method of processing image data for use in training a machine learning system for classifying image data, the method comprising:
obtaining image data comprising a plurality of images, each image corresponding to a respective geographic area;
identifying a plurality of samples for each image, wherein each sample comprises a portion of said image;
identifying one or more classes of topographic feature contained within each sample, wherein the identifying comprises determining a quantity of each respective topographic feature class contained therein; and
applying a simulated annealing algorithm to the plurality of samples, wherein the simulated annealing algorithm iteratively selects candidate subsets of samples from the plurality of samples for each image, and scores the candidate subsets of samples according to the quantities of the topographic feature classes contained therein, to thereby identify an optimal subset of samples from each image such that a balanced distribution of the topographic feature classes is obtained.
|
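For illustration only, the steps of claim 1 can be sketched for the samples of a single image as follows. The claim does not specify how quantities are measured, how candidate subsets are scored, or how the annealing schedule is set, so everything here is an assumption: quantities are taken as pixel counts per class, the score is the standard deviation of class shares (lower is more balanced), candidates are generated by swapping one sample in and one out, and cooling is geometric. The names `class_quantities`, `imbalance`, and `anneal_subset`, and the parameters `k`, `steps`, `t0`, and `cooling`, are hypothetical and do not come from the patent.

```python
import math
import random
from collections import Counter


def class_quantities(label_patch):
    """Count pixels per topographic class in a 2D grid of class labels (assumed measure)."""
    return Counter(label for row in label_patch for label in row)


def imbalance(subset, quantities, classes):
    """Score a candidate subset: std. dev. of class shares, 0.0 when perfectly balanced."""
    totals = [sum(quantities[i].get(c, 0) for i in subset) for c in classes]
    grand = sum(totals)
    if grand == 0:
        return 1.0  # worse than any achievable score, which is always below 1
    mean = 1.0 / len(classes)
    return math.sqrt(sum((t / grand - mean) ** 2 for t in totals) / len(classes))


def anneal_subset(quantities, classes, k, steps=2000, t0=1.0, cooling=0.995, seed=0):
    """Pick k of the samples of one image so that class totals are as balanced as possible."""
    n = len(quantities)
    if not 0 < k < n:
        raise ValueError("k must satisfy 0 < k < number of samples")
    rng = random.Random(seed)
    current = set(rng.sample(range(n), k))
    cur_score = imbalance(current, quantities, classes)
    best, best_score = set(current), cur_score
    temperature = t0
    for _ in range(steps):
        # Candidate subset: swap one selected sample for one unselected sample.
        out_idx = rng.choice(sorted(current))
        in_idx = rng.choice([i for i in range(n) if i not in current])
        candidate = (current - {out_idx}) | {in_idx}
        cand_score = imbalance(candidate, quantities, classes)
        delta = cand_score - cur_score
        # Always accept improvements; accept worse candidates with Boltzmann probability.
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            current, cur_score = candidate, cand_score
            if cur_score < best_score:
                best, best_score = set(current), cur_score
        temperature *= cooling  # geometric cooling schedule (assumed)
    return sorted(best), best_score


# Toy data: per-sample class quantities for one image (made-up numbers).
quantities = [
    {"water": 10}, {"forest": 10}, {"urban": 10},
    {"water": 10}, {"water": 10}, {"water": 9, "forest": 1},
]
subset, score = anneal_subset(quantities, ["water", "forest", "urban"], k=3)
print(subset, score)
```

To follow the claim across a plurality of images, a routine like `anneal_subset` would be called once per image, each time with that image's own list of per-sample quantities. A fixed subset size `k` is a simplification made for this sketch; the claim itself does not require the candidate subsets to have a fixed size.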