mindspore.communication
Collective communication interface.
- mindspore.communication.create_group(group, rank_ids)[source]
Create a user collective communication group.
Note
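GPU version of MindSpore doesn’t support this method. The size of rank_ids should be larger than 1, and rank_ids should not contain duplicates. This method should be used after init().
- Parameters
group (str) – The name of the communication group to create.
rank_ids (list) – The list of rank IDs that make up the group.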
- Raises
TypeError – If group is not a string or rank_ids is not a list.
ValueError – If rank_ids size is not larger than 1, or rank_ids has duplicate data, or backend is invalid.
RuntimeError – If HCCL/NCCL is not available or MindSpore is GPU version.
Examples
>>> from mindspore.context import set_context
>>> from mindspore.communication import init, create_group
>>> set_context(device_target="Ascend")
>>> init()
>>> group = "0-8"
>>> rank_ids = [0, 8]
>>> create_group(group, rank_ids)
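The constraints listed under Note and Raises can be checked before any device is touched. The following is a minimal sketch in plain Python; `check_group_args` is a hypothetical helper, not part of MindSpore, and it only mirrors the documented conditions on `group` and `rank_ids`.

```python
def check_group_args(group, rank_ids):
    """Hypothetical pre-check mirroring the documented create_group constraints."""
    if not isinstance(group, str):
        raise TypeError("group must be a string")
    if not isinstance(rank_ids, list):
        raise TypeError("rank_ids must be a list")
    if len(rank_ids) <= 1:
        raise ValueError("rank_ids must contain more than one rank")
    if len(set(rank_ids)) != len(rank_ids):
        raise ValueError("rank_ids must not contain duplicates")


# Same arguments as in the example above; the check passes silently.
check_group_args("0-8", [0, 8])
```

Running such a check first gives an early, device-independent error for malformed arguments.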
- mindspore.communication.destroy_group(group)[source]
Destroy the user collective communication group.
Note
GPU version of MindSpore doesn’t support this method. The parameter group should not be “hccl_world_group”. This method should be used after init().
- Parameters
group (str) – The communication group to destroy. The group should have been created by create_group.
- Raises
TypeError – If group is not a string.
ValueError – If group is “hccl_world_group” or backend is invalid.
RuntimeError – If HCCL/NCCL is not available or MindSpore is GPU version.
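Since a group made with create_group is later removed with destroy_group, the two calls pair naturally in a context manager. The sketch below is one possible pattern, not something MindSpore provides: `temporary_group` is a hypothetical helper, and the `create` and `destroy` callables are injected so the example runs without devices. With MindSpore one would presumably pass create_group and destroy_group.

```python
from contextlib import contextmanager


@contextmanager
def temporary_group(name, rank_ids, create, destroy):
    """Create a group, yield its name, and always destroy it afterwards.

    `create` and `destroy` are injected callables (an assumption of this
    sketch); they stand in for create_group and destroy_group.
    """
    if name == "hccl_world_group":
        # The documentation forbids destroying the world group.
        raise ValueError("the world group must not be destroyed")
    create(name, rank_ids)
    try:
        yield name
    finally:
        destroy(name)


# Stand-in callables that only record what was called.
calls = []
with temporary_group(
    "0-8",
    [0, 8],
    create=lambda g, r: calls.append(("create", g, r)),
    destroy=lambda g: calls.append(("destroy", g)),
) as group:
    calls.append(("use", group))

print(calls)
```

The `finally` clause ensures the group is destroyed even if the body of the `with` statement raises.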
- mindspore.communication.get_group_rank_from_world_rank(world_rank_id, group)[source]
Get the rank ID in the specified user communication group corresponding to the rank ID in the world communication group.
Note
GPU version of MindSpore doesn’t support this method. The parameter group should not be “hccl_world_group”. This method should be used after init().
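- Parameters
world_rank_id (int) – A rank ID in the world communication group.
group (str) – The user communication group in which to look up the rank ID.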
- Returns
int, the rank ID in the user communication group.
- Raises
TypeError – If world_rank_id is not an integer or the group is not a string.
ValueError – If group is “hccl_world_group” or backend is invalid.
RuntimeError – If HCCL/NCCL is not available or MindSpore is GPU version.
Examples
>>> from mindspore.context import set_context
>>> from mindspore.communication import init, create_group, get_group_rank_from_world_rank
>>> set_context(device_target="Ascend")
>>> init()
>>> group = "0-4"
>>> rank_ids = [0, 4]
>>> create_group(group, rank_ids)
>>> group_rank_id = get_group_rank_from_world_rank(4, group)
>>> print("group_rank_id is: ", group_rank_id)  # group_rank_id is: 1
- mindspore.communication.get_group_size(group=GlobalComm.WORLD_COMM_GROUP)[source]
Get the rank size of the specified collective communication group.
Note
This method should be used after init().
- Parameters
group (str) – The communication group to work on. This is normally a group created by create_group; otherwise, the default group is used. Default: WORLD_COMM_GROUP.
- Returns
int, the rank size of the group.
- Raises
TypeError – If group is not a string.
ValueError – If backend is invalid.
RuntimeError – If HCCL/NCCL is not available.
- mindspore.communication.get_local_rank(group=GlobalComm.WORLD_COMM_GROUP)[source]
Get the local rank ID for the current device in the specified collective communication group.
Note
GPU version of MindSpore doesn’t support this method. This method should be used after init().
- Parameters
group (str) – The communication group to work on. This is normally a group created by create_group; otherwise, the default group is used. Default: WORLD_COMM_GROUP.
- Returns
int, the local rank ID of the calling process within the group.
- Raises
TypeError – If group is not a string.
ValueError – If backend is invalid.
RuntimeError – If HCCL/NCCL is not available or MindSpore is GPU version.
- mindspore.communication.get_local_rank_size(group=GlobalComm.WORLD_COMM_GROUP)[source]
Get the local rank size of the specified collective communication group.
Note
GPU version of MindSpore doesn’t support this method. This method should be used after init().
- Parameters
group (str) – The communication group to work on. This is either a group created by create_group or the default world communication group. Default: WORLD_COMM_GROUP.
- Returns
int, the local rank size of the group in which the calling process resides.
- Raises
TypeError – If group is not a string.
ValueError – If backend is invalid.
RuntimeError – If HCCL/NCCL is not available or MindSpore is GPU version.
- mindspore.communication.get_rank(group=GlobalComm.WORLD_COMM_GROUP)[source]
Get the rank ID for the current device in the specified collective communication group.
Note
This method should be used after init().
- Parameters
group (str) – The communication group to work on. This is normally a group created by create_group; otherwise, the default group is used. Default: WORLD_COMM_GROUP.
- Returns
int, the rank ID of the calling process within the group.
- Raises
TypeError – If group is not a string.
ValueError – If backend is invalid.
RuntimeError – If HCCL/NCCL is not available.
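A common use of the rank ID together with the group size is to split work so that each process handles a disjoint share. The sketch below illustrates round-robin sharding in plain Python. `shard_indices` is a hypothetical helper, and it assumes rank IDs run from 0 to the group size minus 1. In a real job the two values would presumably come from get_rank() and get_group_size().

```python
def shard_indices(num_samples, rank, group_size):
    """Return the sample indices assigned to `rank` under round-robin sharding."""
    if not 0 <= rank < group_size:
        raise ValueError("rank must satisfy 0 <= rank < group_size")
    return list(range(rank, num_samples, group_size))


# Hard-coded stand-ins for get_rank() and get_group_size().
rank, group_size = 1, 4
print(shard_indices(10, rank, group_size))  # [1, 5, 9]
```

Taken over all ranks, the shards cover every index exactly once.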
- mindspore.communication.get_world_rank_from_group_rank(group, group_rank_id)[source]
Get the rank ID in the world communication group corresponding to the rank ID in the specified user communication group.
Note
GPU version of MindSpore doesn’t support this method. The parameter group should not be “hccl_world_group”. This method should be used after init().
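- Parameters
group (str) – The user communication group.
group_rank_id (int) – A rank ID in the user communication group.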
- Returns
int, the rank ID in world communication group.
- Raises
TypeError – If group_rank_id is not an integer or the group is not a string.
ValueError – If group is “hccl_world_group” or backend is invalid.
RuntimeError – If HCCL/NCCL is not available or MindSpore is GPU version.
Examples
>>> from mindspore.context import set_context
>>> from mindspore.communication import init, create_group, get_world_rank_from_group_rank
>>> set_context(device_target="Ascend")
>>> init()
>>> group = "0-4"
>>> rank_ids = [0, 4]
>>> create_group(group, rank_ids)
>>> world_rank_id = get_world_rank_from_group_rank(group, 1)
>>> print("world_rank_id is: ", world_rank_id)  # world_rank_id is: 4
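The two rank-conversion examples on this page (world rank 4 maps to group rank 1, and group rank 1 maps back to world rank 4 for rank_ids = [0, 4]) are consistent with a simple mental model: the group rank is the position of the world rank in rank_ids. The following plain-Python sketch illustrates that model only. Both function names are hypothetical, and the positional rule is an assumption inferred from the examples rather than a statement about the backend.

```python
def group_rank_from_world_rank(world_rank_id, rank_ids):
    """Position of a world rank inside the group's rank list (assumed model)."""
    return rank_ids.index(world_rank_id)


def world_rank_from_group_rank(group_rank_id, rank_ids):
    """World rank stored at a given position of the group's rank list."""
    return rank_ids[group_rank_id]


rank_ids = [0, 4]  # the group "0-4" from the examples
print(group_rank_from_world_rank(4, rank_ids))  # 1
print(world_rank_from_group_rank(1, rank_ids))  # 4
```

Under this model the two conversions are inverses of each other for every rank that belongs to the group.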
- mindspore.communication.init(backend_name=None)[source]
Initialize the distributed backend, e.g. HCCL/NCCL. This is required before using the communication service.
Note
The full name of HCCL is Huawei Collective Communication Library. The full name of NCCL is NVIDIA Collective Communication Library. This method should be used after set_context.
- Parameters
backend_name (str) – The backend to use, HCCL or NCCL. If not set, it is inferred from device_target. Default: None.
- Raises
TypeError – If backend_name is not a string.
RuntimeError – If device target is invalid, or backend is invalid, or distributed initialization fails, or the environment variables RANK_ID/MINDSPORE_HCCL_CONFIG_PATH have not been exported when backend is HCCL.
ValueError – If the environment variable RANK_ID has not been exported as a number.
Examples
>>> from mindspore.context import set_context
>>> from mindspore.communication import init
>>> set_context(device_target="Ascend")
>>> init()
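The Raises section names two environment variables that matter when the backend is HCCL. A launcher script could inspect them before calling init(). The sketch below is a hypothetical pre-flight check in plain Python: `hccl_env_problems` is not a MindSpore function, the file path in the usage line is a placeholder, and treating both variables as required is an assumption of this sketch.

```python
import os


def hccl_env_problems(env=None):
    """Return a list of problems with the HCCL-related environment variables."""
    env = os.environ if env is None else env
    problems = []
    # Assumption: both variables must be exported for the HCCL backend.
    for name in ("RANK_ID", "MINDSPORE_HCCL_CONFIG_PATH"):
        if name not in env:
            problems.append(f"{name} is not exported")
    rank_id = env.get("RANK_ID")
    if rank_id is not None:
        try:
            int(rank_id)
        except ValueError:
            problems.append("RANK_ID is not a number")
    return problems


# Placeholder values; an empty list means no problem was found.
print(hccl_env_problems({"RANK_ID": "0", "MINDSPORE_HCCL_CONFIG_PATH": "/path/to/config.json"}))
```

Passing a dictionary instead of reading `os.environ` keeps the check easy to exercise in isolation.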
- mindspore.communication.release()[source]
Release distributed resources, e.g. HCCL/NCCL.
Note
This method should be used after init().
- Raises
RuntimeError – If the distributed resources fail to be released.