mindspore.mint.distributed.all_to_all_single

mindspore.mint.distributed.all_to_all_single(output, input, output_split_sizes=None, input_split_sizes=None, group=None, async_op=False)

Splits input along dim 0 according to the split sizes, scatters the pieces to all ranks, and gathers the pieces received from all ranks into a single output tensor.

Note

  • The shapes of 'output' and 'input' must be consistent across ranks: the size each rank expects to receive must match what its peers send.

  • Only PyNative mode is supported. Graph mode is not currently supported.

Parameters
  • output (Tensor) – The output tensor, which holds the pieces gathered from all ranks, concatenated along dim 0.

  • input (Tensor) – The tensor to be scattered to the remote ranks.

  • output_split_sizes (Union(Tuple(int), List(int)), optional) – Split sizes of output along dim 0. If None, output is split equally by world_size. Default: None.

  • input_split_sizes (Union(Tuple(int), List(int)), optional) – Split sizes of input along dim 0. If None, input is split equally by world_size. Default: None.

  • group (str, optional) – The communication group to work on. If None, the default group is used, which is "hccl_world_group" on Ascend. Default: None.

  • async_op (bool, optional) – Whether this operator should be an async operator. Default: False.
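The split-size semantics described above can be illustrated without any devices. The following is a minimal single-process NumPy sketch, not the MindSpore implementation; the helper name simulate_all_to_all_single and its argument layout are hypothetical and exist only for illustration.

```python
import numpy as np

def simulate_all_to_all_single(inputs, input_splits, output_splits):
    """Simulate the exchange on one process (hypothetical helper).

    inputs[r] is rank r's input array.
    input_splits[r][d] is the number of dim-0 rows rank r sends to rank d.
    output_splits[r][s] is the number of dim-0 rows rank r expects from rank s.
    """
    world_size = len(inputs)
    # Cut each rank's input into one chunk per destination rank.
    chunks = []
    for r in range(world_size):
        bounds = np.cumsum(input_splits[r])[:-1]
        chunks.append(np.split(inputs[r], bounds, axis=0))
    outputs = []
    for r in range(world_size):
        # Rank r receives the r-th chunk from every source rank, in rank order.
        received = [chunks[s][r] for s in range(world_size)]
        for s, part in enumerate(received):
            # What rank s sends must match what rank r expects to receive.
            assert part.shape[0] == output_splits[r][s]
        outputs.append(np.concatenate(received, axis=0))
    return outputs

rank0 = np.arange(9, dtype=np.float32).reshape(3, 3)       # rows 0..8
rank1 = np.arange(9, 15, dtype=np.float32).reshape(2, 3)   # rows 9..14
out0, out1 = simulate_all_to_all_single(
    [rank0, rank1],
    input_splits=[[2, 1], [1, 1]],
    output_splits=[[2, 1], [1, 1]],
)
print(out0)
print(out1)
```

With these inputs, rank 0 keeps its first two rows and receives the first row of rank 1, while rank 1 receives the last row of rank 0 and keeps its own second row.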

Returns

CommHandle. If async_op is True, an async work handle is returned. If async_op is False, None is returned.
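Because the return value is either a handle or None, calling code may want to treat both cases uniformly. The sketch below assumes the returned handle exposes a wait() method; the helper finish and the stand-in class _FakeHandle are hypothetical, and the stand-in is there only so the snippet runs without devices.

```python
def finish(handle):
    """Block until the communication behind `handle` completes.

    Returns True if a wait was performed, False if `handle` is None
    (the synchronous case, where the output is already filled in).
    Assumes the handle exposes a wait() method.
    """
    if handle is None:
        return False
    handle.wait()
    return True

class _FakeHandle:
    """Stand-in for an async work handle; NOT the real CommHandle."""
    def __init__(self):
        self.waited = False

    def wait(self):
        self.waited = True

# In a real program the handle would come from a call such as:
#   handle = all_to_all_single(output, tensor, async_op=True)
fake = _FakeHandle()
sync_result = finish(None)
async_result = finish(fake)
print(sync_result, async_result, fake.waited)
```

The output tensor should only be read after the wait has returned when async_op is True.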

Raises
  • TypeError – If input or output is not a Tensor, group is not a str, or async_op is not a bool.

  • ValueError – If input_split_sizes is empty and dim 0 of input is not divisible by world_size.

  • ValueError – If output_split_sizes is empty and dim 0 of output is not divisible by world_size.
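The divisibility rule behind the two ValueError cases can be sketched in plain Python. The helper resolve_split_sizes is hypothetical and only mirrors the documented behavior: when no split sizes are given, dim 0 is split equally by world_size, which requires an exact division.

```python
def resolve_split_sizes(dim0, world_size, split_sizes=None):
    """Return the per-rank split sizes along dim 0 (hypothetical helper)."""
    if not split_sizes:
        # No explicit sizes: fall back to an equal split across ranks.
        if dim0 % world_size != 0:
            raise ValueError(
                f"dim 0 ({dim0}) is not divisible by world_size ({world_size})"
            )
        return [dim0 // world_size] * world_size
    # Explicit sizes are used as given.
    return list(split_sizes)

equal = resolve_split_sizes(4, 2)             # equal split
explicit = resolve_split_sizes(3, 2, [2, 1])  # uneven split given explicitly
try:
    resolve_split_sizes(3, 2)                 # 3 rows cannot be split equally over 2 ranks
    raised = False
except ValueError:
    raised = True
print(equal, explicit, raised)
```

This is why the 3-row input in the example for rank 0 must pass explicit split sizes when running on 2 devices.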

Supported Platforms:

Ascend

Examples

Note

Before running the following examples, you need to configure the communication environment variables.

For Ascend devices, it is recommended to use the msrun startup method, which has no third-party or configuration file dependencies. See the msrun startup documentation for more details.

This example should be run with 2 devices.

>>> import numpy as np
>>> from mindspore import Tensor
>>> from mindspore.mint.distributed import init_process_group, get_rank
>>> from mindspore.mint.distributed import all_to_all_single
>>>
>>> init_process_group()
>>> this_rank = get_rank()
>>> # Rank 0 sends 2 rows to rank 0 and 1 row to rank 1,
>>> # and receives 2 rows from rank 0 and 1 row from rank 1.
>>> if this_rank == 0:
...     output = Tensor(np.zeros([3, 3]).astype(np.float32))
...     tensor = Tensor([[0, 1, 2.], [3, 4, 5], [6, 7, 8]])
...     result = all_to_all_single(output, tensor, [2, 1], [2, 1])
...     print(output)
>>> # Rank 1 sends 1 row to each rank and receives 1 row from each rank.
>>> if this_rank == 1:
...     output = Tensor(np.zeros([2, 3]).astype(np.float32))
...     tensor = Tensor([[9, 10., 11], [12, 13, 14]])
...     result = all_to_all_single(output, tensor, [1, 1], [1, 1])
...     print(output)
rank 0:
[[ 0.  1.  2.]
 [ 3.  4.  5.]
 [ 9. 10. 11.]]
rank 1:
[[ 6.  7.  8.]
 [12. 13. 14.]]