
mindspore.mint.distributed.gather(tensor, gather_list, dst=0, group=None, async_op=False)

Gathers tensors from all processes in the specified communication group into a list on the destination process. The tensors are gathered along dimension 0.


Note:

  • Only process dst (global rank) receives the gathered tensors. On every other process, the contents of gather_list have no mathematical meaning.

  • The tensors must have the same shape and format in all processes of the communication group.

  • Only PyNative mode is supported. Graph mode is not currently supported.

Parameters:

  • tensor (Tensor) – The tensor to be gathered.

  • gather_list (list[Tensor]) – List of same-sized tensors to use for gathered data.

  • dst (int, optional) – The global rank of the process that receives the gathered tensors. Only process dst receives them. Default: 0.

  • group (str, optional) – The communication group to work on. If None, the default group is used, which is "hccl_world_group" on Ascend. Default: None.

  • async_op (bool, optional) – Whether this operator should be an async operator. Default: False.


Returns:

CommHandle, an async work handle, if async_op is True. None if async_op is False.

Raises:

  • TypeError – If tensor is not a Tensor, or gather_list is not a list of Tensors.

  • TypeError – If dst is not an integer, group is not a string, or async_op is not a bool.

  • TypeError – If the size of gather_list is not equal to the group size.

  • TypeError – If the type or shape of tensor does not match that of the members of gather_list.

  • RuntimeError – If device target is invalid, or backend is invalid, or distributed initialization fails.
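The documented TypeError conditions on gather_list can be checked before the collective call is made. The following is a minimal sketch in plain Python, not MindSpore code: `check_gather_args` is a hypothetical helper, and each tensor is represented by an assumed `(shape, dtype name)` pair rather than a real Tensor.

```python
from typing import List, Tuple

# (shape, dtype name) stand-in for a Tensor; purely illustrative.
Meta = Tuple[Tuple[int, ...], str]


def check_gather_args(tensor: Meta, gather_list: List[Meta], group_size: int) -> None:
    """Hypothetical pre-flight check mirroring the documented TypeError cases."""
    if not isinstance(gather_list, list):
        raise TypeError("gather_list must be a list")
    # The list needs exactly one slot per process in the group.
    if len(gather_list) != group_size:
        raise TypeError(
            f"gather_list has {len(gather_list)} entries, expected {group_size}"
        )
    # Every slot must match the shape and dtype of the tensor being gathered.
    for i, item in enumerate(gather_list):
        if item != tensor:
            raise TypeError(f"gather_list[{i}] is {item}, expected {tensor}")


ok = ((2, 2), "float32")
check_gather_args(ok, [ok, ok], group_size=2)  # passes silently
```

Running a check of this kind on every rank before the call can surface a mismatched buffer early, with a message that names the offending slot.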

Supported Platforms:




Examples:

Before running the following examples, you need to configure the communication environment variables.

For Ascend devices, it is recommended to use the msrun startup method, which has no third-party or configuration file dependencies. See the msrun startup documentation for more details.

This example should be run with 2 devices.

>>> import numpy as np
>>> from mindspore.mint.distributed import init_process_group, gather
>>> from mindspore import Tensor
>>> # Launch 2 processes.
>>> init_process_group()
>>> # Every rank contributes the same 2x2 tensor.
>>> input = Tensor(np.arange(4).reshape([2, 2]).astype(np.float32))
>>> # One pre-allocated slot per rank in the group.
>>> outputs = [Tensor(np.zeros([2, 2]).astype(np.float32)), Tensor(np.zeros([2, 2]).astype(np.float32))]
>>> gather(input, outputs, dst=0)
>>> print(outputs)
# rank_0
[Tensor(shape=[2, 2], dtype=Float32, value=
[[ 0.00000000e+00,  1.00000000e+00],
 [ 2.00000000e+00,  3.00000000e+00]]), Tensor(shape=[2, 2], dtype=Float32, value=
[[ 0.00000000e+00,  1.00000000e+00],
 [ 2.00000000e+00,  3.00000000e+00]])]
# rank_1
[Tensor(shape=[2, 2], dtype=Float32, value=
[[ 0.00000000e+00,  0.00000000e+00],
 [ 0.00000000e+00,  0.00000000e+00]]), Tensor(shape=[2, 2], dtype=Float32, value=
[[ 0.00000000e+00,  0.00000000e+00],
 [ 0.00000000e+00,  0.00000000e+00]])]
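
The per-rank results above can be reproduced without any devices by simulating the gather in plain Python. This is only a sketch of the semantics, not MindSpore code: `simulate_gather` is a hypothetical function, nested lists stand in for tensors, and non-dst buffers are assumed to stay zero-filled as in the example output (the documentation gives them no mathematical meaning, so this should not be relied upon).

```python
from typing import List

Matrix = List[List[float]]


def simulate_gather(inputs: List[Matrix], dst: int = 0) -> List[List[Matrix]]:
    """Return each rank's gather_list after a simulated gather.

    inputs[r] is the tensor contributed by rank r. Illustrative stand-in only.
    """
    world_size = len(inputs)
    rows, cols = len(inputs[0]), len(inputs[0][0])
    results = []
    for rank in range(world_size):
        if rank == dst:
            # Only dst receives data: slot r holds a copy of rank r's tensor.
            gather_list = [[row[:] for row in inputs[r]] for r in range(world_size)]
        else:
            # Assumption: other ranks keep their pre-allocated zero buffers.
            gather_list = [
                [[0.0] * cols for _ in range(rows)] for _ in range(world_size)
            ]
        results.append(gather_list)
    return results


# Two ranks, each contributing the same 2x2 tensor as in the example.
per_rank_input = [[[0.0, 1.0], [2.0, 3.0]] for _ in range(2)]
rank0, rank1 = simulate_gather(per_rank_input, dst=0)
print(rank0)
print(rank1)
```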