mindspore.parallel.convert_checkpoints

mindspore.parallel.convert_checkpoints(src_checkpoints_dir, dst_checkpoints_dir, ckpt_prefix, src_strategy_file=None, dst_strategy_file=None, process_num=1, output_format='ckpt')

Convert distributed checkpoint from source sharding strategy to destination sharding strategy for a rank.

Note

The src_checkpoints_dir directory should be organized like "src_checkpoints_dir/rank_0/a.ckpt": each rank has its own subdirectory named after the rank number, and the checkpoint file is stored in that subdirectory. If a rank directory contains multiple files, the last file in lexicographic order is selected.

The appropriate value of process_num depends on the size of the host. Setting it too large is not recommended, because it may cause the host to freeze.
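The directory layout and selection rule from the first note can be checked before conversion with a small standard-library sketch. The helper name `select_rank_checkpoints` is hypothetical and is not part of MindSpore. It only imitates the documented rule (last file in lexicographic order per rank directory), under the assumption that checkpoint files carry the `.ckpt` suffix.

```python
import os


def select_rank_checkpoints(src_checkpoints_dir, suffix=".ckpt"):
    """Map each rank id to the checkpoint file the documented rule would pick.

    Hypothetical helper (not a MindSpore API): it expects the layout
    src_checkpoints_dir/rank_<N>/<name>.ckpt and, per rank directory,
    returns the last file name in lexicographic order.
    """
    if not os.path.isdir(src_checkpoints_dir):
        raise NotADirectoryError(f"{src_checkpoints_dir} is not a directory")

    selected = {}
    for entry in sorted(os.listdir(src_checkpoints_dir)):
        rank_dir = os.path.join(src_checkpoints_dir, entry)
        # Only sub-directories named rank_<integer> are treated as ranks.
        if not (os.path.isdir(rank_dir) and entry.startswith("rank_")):
            continue
        rank_text = entry[len("rank_"):]
        if not rank_text.isdigit():
            continue
        files = sorted(f for f in os.listdir(rank_dir) if f.endswith(suffix))
        if not files:
            raise ValueError(f"no checkpoint file found in {rank_dir}")
        # Plain string ordering: "step_9.ckpt" sorts after "step_10.ckpt".
        selected[int(rank_text)] = os.path.join(rank_dir, files[-1])
    return selected
```

Because the ordering is by string rather than by number, zero-padding step numbers in file names (for example "step_0009.ckpt") is one way to make the most recent file sort last.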

Parameters
  • src_checkpoints_dir (str) – The source checkpoints directory.

  • dst_checkpoints_dir (str) – The destination checkpoints directory to save the converted checkpoints.

  • ckpt_prefix (str) – The destination checkpoint name prefix.

  • src_strategy_file (str, optional) – Name of the source sharding strategy file, which is saved by 'mindspore.parallel.auto_parallel.AutoParallel(cell).save_param_strategy_file(file_path)'. If 'src_strategy_file' is None, the source sharding strategy is treated as having no sharding for any parameter. Default: None.

  • dst_strategy_file (str, optional) – Name of the destination sharding strategy file, which is saved by 'mindspore.parallel.auto_parallel.AutoParallel(cell).save_param_strategy_file(file_path)'. If 'dst_strategy_file' is None, the destination sharding strategy is treated as having no sharding for any parameter. Default: None.

  • process_num (int, optional) – Number of processes to use for parallel processing. Default: 1.

  • output_format (str, optional) – Control the format of the output checkpoint after conversion. It can be set to either "ckpt" or "safetensors". Default: "ckpt".
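To show how the parameters above fit together, here is a minimal sketch that passes every argument by keyword. The wrapper `convert_with_capped_processes`, the helper `cap_process_num`, and the rule of capping `process_num` at the host's CPU count are assumptions made for illustration. The documentation only says that the value should not be too large for the host.

```python
import os


def cap_process_num(requested, cpu_count=None):
    """Clamp a requested process count to the range [1, number of CPUs].

    Illustrative heuristic only: the CPU-count cap is an assumption,
    not a rule stated by MindSpore.
    """
    if cpu_count is None:
        cpu_count = os.cpu_count() or 1
    return max(1, min(int(requested), cpu_count))


def convert_with_capped_processes(src_dir, dst_dir, prefix,
                                  src_strategy_file=None,
                                  dst_strategy_file=None,
                                  requested_processes=1,
                                  output_format="ckpt"):
    """Hypothetical wrapper around mindspore.parallel.convert_checkpoints."""
    if output_format not in ("ckpt", "safetensors"):
        raise ValueError("output_format must be 'ckpt' or 'safetensors'")
    # Imported lazily so cap_process_num can be used without MindSpore.
    from mindspore.parallel import convert_checkpoints
    convert_checkpoints(
        src_checkpoints_dir=src_dir,
        dst_checkpoints_dir=dst_dir,
        ckpt_prefix=prefix,
        src_strategy_file=src_strategy_file,
        dst_strategy_file=dst_strategy_file,
        process_num=cap_process_num(requested_processes),
        output_format=output_format,
    )
```

Leaving both strategy files as None corresponds to the documented case in which neither side has any sharding. In practice at least one of them would usually point to a saved strategy file.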

Raises
  • ValueError – src_strategy_file or dst_strategy_file is incorrect.

  • NotADirectoryError – src_checkpoints_dir or dst_checkpoints_dir is not a directory.

  • ValueError – The checkpoint file is missing in src_checkpoints_dir.

  • TypeError – src_strategy_file or dst_strategy_file is not a string.

Supported Platforms:

Ascend

Examples

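>>> from mindspore.parallel import convert_checkpoints
>>> # Placeholder paths for illustration; replace them with your own directories.
>>> src_checkpoints_dir = "./src_checkpoints"
>>> dst_checkpoints_dir = "./dst_checkpoints"
>>> convert_checkpoints(src_checkpoints_dir, dst_checkpoints_dir, "dst_checkpoint",
...                     "./src_strategy.ckpt", "./dst_strategy.ckpt")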