MindSpore Distributed Operator List
Distributed Operator
| op name | constraints |
| --- | --- |
| | None |
| | None |
| | None |
| | None |
| | None |
| | The logits can't be split along the axis dimension; otherwise the result is mathematically inconsistent with that on a single machine (see the sketch after this table). |
| | The logits can't be split along the axis dimension; otherwise the result is mathematically inconsistent with that on a single machine. |
| | None |
| | None |
| | None |
| | None |
| | None |
| | None |
| | None |
| | None |
| | None |
| | None |
| | Repeated calculation is not supported. |
| | None |
| | None |
| | None |
| | None |
| | None |
| | None |
| | None |
| | None |
| | None |
| | None |
| | None |
| | None |
| | None |
| | None |
| | None |
| | The input_x can't be split along the axis dimension; otherwise the result is mathematically inconsistent with that on a single machine. |
| | Needs to be used in conjunction with |
| | Needs to be used in conjunction with |
| | Only supports 1-dim and 2-dim parameters, and the last dimension of the input_params should be 32-byte aligned; scalar input_indices is not supported; repeated calculation is not supported when the parameters are split along the axis dimension; splitting input_indices and input_params at the same time is not supported. |
| | The same as GatherV2. |
| | The same as GatherV2. |
| | The input_x can't be split along the axis dimension; otherwise the result is mathematically inconsistent with that on a single machine. |
| | The last dimension of logits and labels can't be split; only output[0] is supported. |
| | When the shape of weight is not [1], the shard strategy of input_x in the channel dimension should be consistent with that of weight. |
| | Only supports 1-dim indices. The shard strategy must be configured for the output and for the first and second inputs. |
| | None |
| | When input_x is split along the axis dimension, the distributed result may be inconsistent with that on a single machine. |
| | When input_x is split along the axis dimension, the distributed result may be inconsistent with that on a single machine. |
| | When input_x is split along the axis dimension, the distributed result may be inconsistent with that on a single machine. |
| | When input_x is split along the axis dimension, the distributed result may be inconsistent with that on a single machine. |
| | None |
| | Configuring a shard strategy is not supported. |
| | Only supports a mask with all 0 values; the dimension that needs to be split must be fully extracted; splitting is not supported when the stride of a dimension is 1. |
| | Only supports configuring the shard strategy for multiples. |
| | None |
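The following is a minimal sketch of how a shard strategy is typically configured in semi-auto parallel mode so that it respects a constraint of the kind listed above. The device number, input layout, and the choice of Softmax are assumptions for illustration, not taken from the table.

```python
import mindspore.nn as nn
import mindspore.ops as ops
from mindspore import context

# Assumed setup: 8 devices running in semi-auto parallel mode.
context.set_auto_parallel_context(parallel_mode="semi_auto_parallel", device_num=8)

class SoftmaxNet(nn.Cell):
    """Illustrative cell; the input is assumed to have layout (batch, seq, hidden)."""
    def __init__(self):
        super().__init__()
        # Softmax over the last axis: split batch into 2 and seq into 4,
        # but keep the softmax axis whole (factor 1), because the logits
        # can't be split along the axis dimension.
        self.softmax = ops.Softmax(axis=-1).shard(((2, 4, 1),))

    def construct(self, x):
        return self.softmax(x)
```

In this sketch the product of the strategy factors (2 × 4 × 1 = 8) matches the device number, so no repeated calculation occurs.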
Repeated calculation means that the devices are not fully used. For example, if a cluster has 8 devices for distributed training but the splitting strategy only cuts the input into 4 slices, repeated calculation occurs: each slice is computed on more than one device.
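As a hedged illustration of this repeated-calculation case (the operator, input rank, and device number below are assumptions, not taken from the table):

```python
import mindspore.ops as ops
from mindspore import context

# Assumed setup: 8 devices, but the strategy only produces 2 x 2 = 4 slices.
context.set_auto_parallel_context(parallel_mode="semi_auto_parallel", device_num=8)

# Element-wise Mul with both 2-D inputs split into (2, 2): 4 slices on
# 8 devices, so each slice is computed repeatedly on 8 / 4 = 2 devices.
mul = ops.Mul().shard(((2, 2), (2, 2)))
```

Operators whose constraint above says that repeated calculation is not supported must therefore be given a strategy whose factors multiply to the full device number.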