mindspore.mint.distributed.gather

mindspore.mint.distributed.gather(tensor, gather_list, dst=0, group=None, async_op=False)

Gathers tensors from every process in the specified communication group. The operation gathers the tensor from each process along dimension 0.

Note

  • Only process dst (global rank) keeps the gathered tensors. On every other process, the tensor list holds values that have no mathematical meaning.

  • The tensors must have the same shape and format in all processes of the communication group.

  • Only PyNative mode is supported. Graph mode is not currently supported.
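The behaviour described in these notes can be illustrated without any device as a single-process simulation. The sketch below is plain Python and does not call MindSpore; `simulate_gather` is a hypothetical helper, nested lists stand in for tensors, and `None` models the "no mathematical meaning" contents on non-destination ranks.

```python
# Single-process simulation of gather semantics (no MindSpore, no devices).
# simulate_gather is a hypothetical helper used only for illustration.

def simulate_gather(per_rank_tensors, dst=0):
    """Return, for each rank, the gather_list it would hold after the call.

    per_rank_tensors[r] is the tensor contributed by rank r
    (nested lists stand in for tensors here).
    """
    world_size = len(per_rank_tensors)
    if not 0 <= dst < world_size:
        raise ValueError(f"dst must be in [0, {world_size}), got {dst}")
    results = {}
    for rank in range(world_size):
        if rank == dst:
            # Destination rank: slot i holds a copy of rank i's tensor.
            results[rank] = [[row[:] for row in t] for t in per_rank_tensors]
        else:
            # Other ranks: contents are not meaningful; modelled as None.
            results[rank] = None
    return results


inputs = [
    [[0.0, 1.0], [2.0, 3.0]],  # contributed by rank 0
    [[4.0, 5.0], [6.0, 7.0]],  # contributed by rank 1
]
gathered = simulate_gather(inputs, dst=0)
print(gathered[0])  # what rank 0 (dst) holds
print(gathered[1])  # what rank 1 holds in this model
```

In this model only the entry for `dst` is usable, which mirrors the first note: code that consumes the gathered data should run on the destination rank only.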

Parameters
  • tensor (Tensor) – The tensor to be gathered.

  • gather_list (list[Tensor]) – List of same-sized tensors to use for gathered data.

  • dst (int, optional) – The global rank of the process that receives the gathered tensors. Only process dst receives them. Default: 0.

  • group (str, optional) – The communication group to work on. If None, "hccl_world_group" is used on Ascend. Default: None.

  • async_op (bool, optional) – Whether this operator should be an async operator. Default: False.

Returns

CommHandle. If async_op is True, an async work handle is returned. If async_op is False, None is returned.
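The return value described above suggests a start-then-wait pattern for asynchronous use. The sketch below shows that pattern under one labelled assumption: that the returned handle exposes a `wait()` method. `FakeHandle` and `fake_gather` are hypothetical stand-ins so the snippet runs without devices; on Ascend, the real `gather` would be passed as `gather_fn` instead.

```python
class FakeHandle:
    """Hypothetical stand-in for CommHandle; assumes a wait() method."""

    def __init__(self):
        self.waited = False

    def wait(self):
        self.waited = True


def fake_gather(tensor, gather_list, dst=0, group=None, async_op=False):
    """Hypothetical stand-in with gather's signature; fills every slot."""
    for i in range(len(gather_list)):
        gather_list[i] = tensor
    # Mirrors the documented contract: a handle if async_op, else None.
    return FakeHandle() if async_op else None


def gather_and_wait(gather_fn, tensor, gather_list, dst=0):
    """Start an asynchronous gather, then block until it completes."""
    handle = gather_fn(tensor, gather_list, dst=dst, async_op=True)
    # Work that does not read gather_list could run here to overlap
    # with communication.
    if handle is not None:
        handle.wait()
    return handle


out = [None, None]
handle = gather_and_wait(fake_gather, [1.0, 2.0], out)
sync_result = fake_gather([1.0, 2.0], [None, None], async_op=False)
```

The `None` check keeps the helper safe if it is ever called with a function that runs synchronously.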

Raises
  • TypeError – If tensor is not a Tensor, or gather_list is not a list of Tensors.

  • TypeError – If dst is not an integer, group is not a string, or async_op is not a bool.

  • TypeError – If the size of gather_list is not equal to the group size.

  • TypeError – If the type or shape of tensor does not match that of the members of gather_list.

  • RuntimeError – If device target is invalid, or backend is invalid, or distributed initialization fails.
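The TypeError conditions listed above can be mirrored in a small pre-flight check, which may help catch argument mistakes before launching a multi-device job. This is a sketch only, not MindSpore's own validation: `FakeTensor` and `check_gather_args` are hypothetical names, and the group size is passed in explicitly rather than queried from a communication group.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class FakeTensor:
    """Hypothetical stand-in for a tensor: only shape and dtype matter here."""
    shape: Tuple[int, ...]
    dtype: str


def check_gather_args(tensor, gather_list, dst, group, async_op, group_size):
    """Raise TypeError for the argument problems listed under Raises."""
    if not isinstance(tensor, FakeTensor):
        raise TypeError("tensor must be a Tensor")
    if not isinstance(gather_list, list) or not all(
        isinstance(t, FakeTensor) for t in gather_list
    ):
        raise TypeError("gather_list must be a list of Tensors")
    if not isinstance(dst, int):
        raise TypeError("dst must be an integer")
    if group is not None and not isinstance(group, str):
        raise TypeError("group must be a string or None")
    if not isinstance(async_op, bool):
        raise TypeError("async_op must be a bool")
    if len(gather_list) != group_size:
        raise TypeError("size of gather_list must equal the group size")
    for t in gather_list:
        if t.shape != tensor.shape or t.dtype != tensor.dtype:
            raise TypeError("gather_list members must match tensor in type and shape")


t = FakeTensor(shape=(2, 2), dtype="float32")
# Valid arguments for a group of 2: no exception is raised.
check_gather_args(t, [t, t], dst=0, group=None, async_op=False, group_size=2)

# A gather_list with one slot for a group of 2 is rejected.
try:
    check_gather_args(t, [t], dst=0, group=None, async_op=False, group_size=2)
    message = ""
except TypeError as err:
    message = str(err)
```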

Supported Platforms:

Ascend

Examples

Note

Before running the following examples, you need to configure the communication environment variables.

For Ascend devices, the msrun startup method is recommended because it needs no third-party or configuration file dependencies. See the msrun startup documentation for details.

This example should be run with 2 devices.

>>> import numpy as np
>>> from mindspore import Tensor
>>> from mindspore.mint.distributed import init_process_group, gather
>>> # Launch 2 processes; each process runs this script.
>>> init_process_group()
>>> input = Tensor(np.arange(4).reshape([2, 2]).astype(np.float32))
>>> # One pre-allocated output tensor per rank in the group.
>>> outputs = [Tensor(np.zeros([2, 2]).astype(np.float32)),
...            Tensor(np.zeros([2, 2]).astype(np.float32))]
>>> gather(input, outputs, dst=0)
>>> print(outputs)
# rank_0
[Tensor(shape=[2, 2], dtype=Float32, value=
[[ 0.00000000e+00,  1.00000000e+00],
 [ 2.00000000e+00,  3.00000000e+00]]), Tensor(shape=[2, 2], dtype=Float32, value=
[[ 0.00000000e+00,  1.00000000e+00], [ 2.00000000e+00,  3.00000000e+00]])]
# rank_1
[Tensor(shape=[2, 2], dtype=Float32, value=
[[ 0.00000000e+00,  0.00000000e+00],
 [ 0.00000000e+00,  0.00000000e+00]]), Tensor(shape=[2, 2], dtype=Float32, value=
[[ 0.00000000e+00,  0.00000000e+00], [ 0.00000000e+00,  0.00000000e+00]])]