CPC G06F 16/182 (2019.01) [G06F 16/2282 (2019.01); G06F 16/24573 (2019.01); G06F 16/285 (2019.01)] | 13 Claims |
1. A computer-implemented method comprising:
obtaining a configuration file of a distributed file system federation, wherein:
the distributed file system federation uses multiple independent subclusters;
each subcluster is independently coordinated;
a plurality of data nodes is used as common storage for blocks by all of the subclusters and each data node registers with all of the subclusters of the multiple independent subclusters; and
the configuration file comprises a list of a plurality of subclusters within the distributed file system federation and migration trigger factors for the plurality of subclusters;
determining, by one or more processors, a list of one or more source subclusters and a list of to-be-migrated directories in the one or more source subclusters based on a scanning result of the plurality of subclusters and the migration trigger factors in the configuration file; and
migrating data from the one or more source subclusters to one or more target subclusters according to a generated migration plan, wherein the migration plan is generated, with linear programming, to migrate the to-be-migrated directories from the one or more source subclusters to the one or more target subclusters in the distributed file system federation, wherein:
the linear programming includes an objective function of maximizing a number of subclusters meeting an expected capacity usage after migration;
the objective function is subject to a pruning constraint; and
the pruning constraint is that a total amount of metadata, comprising a number of files and directories, to be migrated from a source subcluster to a target subcluster is less than a constant number as determined from the migration trigger factors in the configuration file.
|
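The selection and planning steps recited in claim 1 can be illustrated with a minimal sketch. It is not the claimed implementation. The names `Directory`, `Subcluster`, `select_sources`, and `plan_migration`, and the trigger factors `usage_threshold`, `expected_usage`, and `max_metadata`, are all hypothetical. The sketch also swaps in an exhaustive search over a toy instance in place of a linear-programming solver. It evaluates the same objective (the number of subclusters meeting the expected capacity usage after migration) and the same pruning constraint (metadata moved per source-target pair stays below a constant).

```python
from dataclasses import dataclass, field
from itertools import product


@dataclass
class Directory:
    path: str
    size: int        # capacity units occupied by the directory
    meta_count: int  # number of files and directories under the path


@dataclass
class Subcluster:
    name: str
    capacity: int
    used: int
    directories: list = field(default_factory=list)


def select_sources(subclusters, usage_threshold):
    """Pick source subclusters whose scanned usage exceeds the (assumed) trigger."""
    sources = [s for s in subclusters if s.used / s.capacity > usage_threshold]
    candidates = [(s.name, d) for s in sources for d in s.directories]
    return sources, candidates


def plan_migration(subclusters, usage_threshold, expected_usage, max_metadata):
    """Brute-force stand-in for the LP: maximize subclusters meeting expected usage."""
    sources, candidates = select_sources(subclusters, usage_threshold)
    source_names = {s.name for s in sources}
    targets = [s.name for s in subclusters if s.name not in source_names]
    capacity = {s.name: s.capacity for s in subclusters}
    best_key, best_plan = None, []
    # Each candidate directory either stays (None) or moves to one target.
    for choice in product([None] + targets, repeat=len(candidates)):
        used = {s.name: s.used for s in subclusters}
        meta = {}   # (source, target) -> metadata items moved
        moved = 0
        for (src, d), dst in zip(candidates, choice):
            if dst is None:
                continue
            used[src] -= d.size
            used[dst] += d.size
            meta[(src, dst)] = meta.get((src, dst), 0) + d.meta_count
            moved += d.size
        # Pruning constraint: metadata per source-target pair must stay
        # strictly below the constant taken from the trigger factors.
        if any(m >= max_metadata for m in meta.values()):
            continue
        meeting = sum(used[n] / capacity[n] <= expected_usage for n in used)
        key = (meeting, -moved)  # tie-break (an assumption): move less data
        if best_key is None or key > best_key:
            best_key = key
            best_plan = [(src, d.path, dst)
                         for (src, d), dst in zip(candidates, choice)
                         if dst is not None]
    return best_plan, best_key[0]


# Toy federation with made-up numbers.
subclusters = [
    Subcluster("ns1", 100, 90, [Directory("/logs", 40, 500),
                                Directory("/tmp", 20, 2000)]),
    Subcluster("ns2", 100, 20),
]
plan, meeting = plan_migration(subclusters, usage_threshold=0.8,
                               expected_usage=0.7, max_metadata=1000)
print(plan, meeting)
```

In this toy instance, `/tmp` carries 2000 metadata items and so is pruned from every plan, while moving `/logs` brings both subclusters within the expected usage. An exhaustive search scales exponentially with the number of candidate directories, so a real planner would hand the same objective and constraint to an LP or integer-programming solver.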