mindspore.communication

Collective communication interface.

class mindspore.communication.GlobalComm[source]

World communication information.

mindspore.communication.create_group(group, rank_ids)[source]

Create a user collective communication group.

Note

NCCL is not supported. rank_ids must contain more than one rank ID and must not contain duplicates.

Parameters
  • group (str) – ProcessGroup, the process group to create.

  • rank_ids (list) – A list of device IDs.

Raises
  • TypeError – If group is not a string or rank_ids is not a list.

  • ValueError – If rank_ids contains fewer than two elements or contains duplicates, or the backend is invalid.

  • RuntimeError – If HCCL/NCCL is not available or NCCL is not supported.
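
The argument checks listed above can be sketched in plain Python. This is only an illustration of the documented conditions, not the library's implementation; the helper name check_create_group_args is hypothetical and does not exist in MindSpore.

```python
def check_create_group_args(group, rank_ids):
    """Hypothetical helper mirroring the documented TypeError/ValueError cases."""
    if not isinstance(group, str):
        raise TypeError("group must be a string")
    if not isinstance(rank_ids, list):
        raise TypeError("rank_ids must be a list")
    if len(rank_ids) <= 1:
        raise ValueError("rank_ids must contain more than one rank ID")
    if len(set(rank_ids)) != len(rank_ids):
        raise ValueError("rank_ids must not contain duplicates")


# Valid arguments pass silently; invalid ones raise.
check_create_group_args("0-1", [0, 1])
```

Running such a check before calling create_group can surface a bad argument early, before any device communication is attempted.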

Examples

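>>> from mindspore.communication import init, create_group
>>> init()  # the backend must be initialized before creating groups
>>> group = "0-1"
>>> rank_ids = [0, 1]
>>> create_group(group, rank_ids)

mindspore.communication.destroy_group(group)[source]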

Destroy the user collective communication group.

Note

NCCL is not supported. The parameter group should not be “hccl_world_group”.

Parameters

group (str) – ProcessGroup, the process group to destroy.

Raises
  • TypeError – If group is not a string.

  • ValueError – If group is “hccl_world_group” or backend is invalid.

  • RuntimeError – If HCCL/NCCL is not available or NCCL is not supported.

mindspore.communication.get_group_rank_from_world_rank(world_rank_id, group)[source]

Get the rank ID in the specified user communication group corresponding to the rank ID in the world communication group.

Note

NCCL is not supported. The parameter group should not be “hccl_world_group”.

Parameters
  • world_rank_id (int) – A rank ID in the world communication group.

  • group (str) – The user communication group.

Returns

int, the rank ID in the user communication group.

Raises
  • TypeError – If world_rank_id is not an integer or the group is not a string.

  • ValueError – If group is “hccl_world_group” or backend is invalid.

  • RuntimeError – If HCCL/NCCL is not available or NCCL is not supported.
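
The mapping described above can be modelled without any devices. The sketch below assumes that a rank's ID inside a user group equals its position in the rank_ids list passed to create_group; that ordering is an assumption made for illustration, and the function name group_rank_from_world_rank is hypothetical.

```python
def group_rank_from_world_rank(world_rank_id, rank_ids):
    """Hypothetical model: the group rank is the position within rank_ids."""
    if world_rank_id not in rank_ids:
        raise ValueError(f"world rank {world_rank_id} is not in the group")
    return rank_ids.index(world_rank_id)


# A group built from world ranks 0 and 4.
rank_ids = [0, 4]
print(group_rank_from_world_rank(4, rank_ids))  # prints 1 under this model
```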

mindspore.communication.get_group_size(group='hccl_world_group')[source]

Get the rank size of the specified collective communication group.

Parameters

group (str) – ProcessGroup, the process group to work on.

Returns

int, the rank size of the group.

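Raises
  • TypeError – If group is not a string.

  • ValueError – If backend is invalid.

  • RuntimeError – If HCCL/NCCL is not available or NCCL is not supported.

mindspore.communication.get_local_rank(group='hccl_world_group')[source]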

Get the local rank ID for the current device in the specified collective communication group.

Note

NCCL is not supported.

Parameters

group (str) – ProcessGroup, the process group to work on. Default: WORLD_COMM_GROUP.

Returns

int, the local rank ID of the calling process within the group.

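Raises
  • TypeError – If group is not a string.

  • ValueError – If backend is invalid.

  • RuntimeError – If HCCL/NCCL is not available or NCCL is not supported.

mindspore.communication.get_local_rank_size(group='hccl_world_group')[source]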

Get the local rank size of the specified collective communication group.

Note

NCCL is not supported.

Parameters

group (str) – ProcessGroup, the process group to work on.

Returns

int, the local rank size of the group that the calling process belongs to.

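Raises
  • TypeError – If group is not a string.

  • ValueError – If backend is invalid.

  • RuntimeError – If HCCL/NCCL is not available or NCCL is not supported.

mindspore.communication.get_rank(group='hccl_world_group')[source]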

Get the rank ID for the current device in the specified collective communication group.

Parameters

group (str) – ProcessGroup, the process group to work on. Default: WORLD_COMM_GROUP.

Returns

int, the rank ID of the calling process within the group.

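Raises
  • TypeError – If group is not a string.

  • ValueError – If backend is invalid.

  • RuntimeError – If HCCL/NCCL is not available or NCCL is not supported.

mindspore.communication.get_world_rank_from_group_rank(group, group_rank_id)[source]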

Get the rank ID in the world communication group corresponding to the rank ID in the specified user communication group.

Note

NCCL is not supported. The parameter group should not be “hccl_world_group”.

Parameters
  • group (str) – The user communication group.

  • group_rank_id (int) – A rank ID in the user communication group.

Returns

int, the rank ID in the world communication group.

Raises
  • TypeError – If group_rank_id is not an integer or the group is not a string.

  • ValueError – If group is “hccl_world_group” or backend is invalid.

  • RuntimeError – If HCCL/NCCL is not available or NCCL is not supported.
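
The reverse lookup can be modelled the same way. The sketch below again assumes that group ranks follow the order of the rank_ids list used to create the group; this is an illustrative assumption, and the function name world_rank_from_group_rank is hypothetical.

```python
def world_rank_from_group_rank(rank_ids, group_rank_id):
    """Hypothetical model: index into rank_ids to recover the world rank."""
    if not 0 <= group_rank_id < len(rank_ids):
        raise ValueError(f"group rank {group_rank_id} is out of range")
    return rank_ids[group_rank_id]


# A group built from world ranks 0 and 4: group rank 1 is world rank 4.
rank_ids = [0, 4]
print(world_rank_from_group_rank(rank_ids, 1))  # prints 4 under this model
```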

mindspore.communication.init(backend_name=None)[source]

Initialize the distributed backend, e.g. HCCL or NCCL. This is required before using the communication service.

Note

The full name of HCCL is Huawei Collective Communication Library. The full name of NCCL is NVIDIA Collective Communication Library.

Parameters

backend_name (str) – The backend to use, e.g. HCCL or NCCL. Default: None.

Raises
  • TypeError – If backend_name is not a string.

  • RuntimeError – If device target is invalid, or backend is invalid, or distributed initialization fails.
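
A common reason to call init is to split work across devices by rank. The sketch below is one possible pattern, not a prescribed one: shard_indices and main are hypothetical names, the round-robin split is an illustrative choice, and main relies on the default backend_name and must be launched in a properly configured distributed environment.

```python
def shard_indices(num_samples, rank, group_size):
    """Hypothetical helper: round-robin split of sample indices across ranks."""
    return list(range(rank, num_samples, group_size))


def main():
    # Imported lazily so the helper above works without MindSpore installed.
    from mindspore.communication import init, get_rank, get_group_size

    init()  # must run before any other communication call
    rank = get_rank()
    size = get_group_size()
    print(f"rank {rank} of {size} handles samples {shard_indices(10, rank, size)}")


# Call main() from a script started by your distributed launcher.
```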

mindspore.communication.release()[source]

Release distributed resources, e.g. HCCL/NCCL.

Raises

RuntimeError – If releasing the distributed resources fails.
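
One way to make sure release is always paired with init is a small context manager. This is a sketch under stated assumptions: communication_session is a hypothetical name, and the init_fn and release_fn parameters exist only so the pattern can be exercised without devices; when they are omitted, the real init and release are imported.

```python
from contextlib import contextmanager


@contextmanager
def communication_session(init_fn=None, release_fn=None):
    """Hypothetical wrapper: initialize on entry, release on exit."""
    if init_fn is None or release_fn is None:
        # Imported lazily so the wrapper can be tested without MindSpore.
        from mindspore.communication import init, release
        init_fn = init_fn or init
        release_fn = release_fn or release
    init_fn()
    try:
        yield
    finally:
        # Runs even if the body raises, so resources are not leaked.
        release_fn()


# Demonstration with stand-in callables instead of a real backend.
calls = []
with communication_session(lambda: calls.append("init"),
                           lambda: calls.append("release")):
    calls.append("work")
print(calls)  # prints ['init', 'work', 'release']
```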