mindspore.parallel.shard
- mindspore.parallel.shard(fn, in_strategy, out_strategy=None, parameter_plan=None)[source]
Specify the input and output slicing strategies for a Cell or function. In PyNative mode, use this method to specify a Cell for distributed execution in graph mode. In graph mode, use this method to specify the distribution strategy for a Cell; strategies for the other operators are set by sharding propagation. in_strategy and out_strategy define the input and output layouts respectively. Each should be a tuple whose elements describe the desired layout of the corresponding input/output, where None represents data parallel; refer to the description of mindspore.ops.Primitive.shard() for details. The parallel strategies of the remaining operators are derived from the strategies specified for the inputs and outputs. A minimal usage sketch follows the note below.
Note
It is valid only in semi auto parallel or auto parallel mode. In other parallel modes, strategies set here will be ignored.
If the input contains a Parameter, its strategy should be set in in_strategy.
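For orientation, here is a minimal, hedged sketch of the functional form under the constraints above. It assumes an already initialized communication group and an 8-device (semi-)auto parallel setup, mirroring Case 2 of the Examples below; the function scale and the shapes are illustrative only.
>>> import numpy as np
>>> import mindspore as ms
>>> from mindspore import Tensor, ops
>>> from mindspore.parallel import shard
>>> def scale(x):
...     return ops.mul(x, 2.0)
>>> # Slice the single 2-D input 4-way on dim 0 and 2-way on dim 1 across 8 devices;
>>> # out_strategy=None leaves the output layout to data parallel.
>>> sharded_scale = shard(scale, in_strategy=((4, 2),))
>>> x = Tensor(np.ones((32, 16)), dtype=ms.float32)
>>> y = sharded_scale(x)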
- Parameters
fn (Union[Cell, Function]) – Function to be executed in parallel. Its arguments and return value must be Tensors. If fn is a Cell with parameters, fn needs to be an instantiated object; otherwise its arguments cannot be accessed.
in_strategy (tuple) – Defines the layout of the inputs. Each element of the tuple should be a tuple(int) or a tuple(mindspore.parallel.Layout) describing the layout of the corresponding input.
out_strategy (Union[tuple, None], optional) – Defines the layout of the outputs, similar to in_strategy. Default: None.
parameter_plan (Union[dict, None], optional) – Defines the layout for the specified parameters. Each entry of the dict defines the layout of one parameter, like "param_name: layout". The key is a parameter name of type str. The value is a 1-D integer tuple or a 1-D mindspore.parallel.Layout tuple indicating the corresponding layout (see the sketch after this list). If a parameter name is incorrect or the corresponding parameter has already been set, the entry will be ignored. Supported only when fn is a Cell with parameters. Default: None.
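For illustration only, a hedged sketch of how a Layout-based in_strategy and parameter_plan might be assembled. The device-axis names ("dp", "mp") and the parameter name "net.dense1.weight" are placeholders rather than values required by the API, and the usage mirrors the Layout case in the Examples below.
>>> from mindspore.parallel import Layout
>>> # 8 devices arranged as a 4 x 2 device matrix with named axes (placeholder names).
>>> layout = Layout((4, 2), ("dp", "mp"))
>>> # One 2-D input: dim 0 split along "dp", dim 1 split along "mp".
>>> in_strategy = (layout("dp", "mp"),)
>>> # Placeholder parameter name mapped to its layout.
>>> parameter_plan = {"net.dense1.weight": layout("dp", "mp")}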
- Returns
Function, the function that will be executed under the auto parallel process.
- Raises
AssertionError – If the parallel mode is neither "auto_parallel" nor "semi_auto_parallel".
TypeError – If in_strategy is not a tuple.
TypeError – If out_strategy is not a tuple or None.
TypeError – If any element in in_strategy is not a tuple(int) or tuple(mindspore.parallel.Layout).
TypeError – If any element in out_strategy is not a tuple(int) or tuple(mindspore.parallel.Layout).
TypeError – If parameter_plan is not a dict or None.
TypeError – If any key in parameter_plan is not a str.
TypeError – If any value in parameter_plan is not a tuple(int) or a tuple(mindspore.parallel.Layout).
- Supported Platforms:
Ascend
Examples
>>> import numpy as np
>>> import mindspore as ms
>>> from mindspore import Tensor, nn, ops
>>> from mindspore.communication import init
>>> from mindspore.parallel import shard
>>> from mindspore.parallel import Layout
>>> from mindspore.nn.utils import no_init_parameters
>>> from mindspore.parallel.auto_parallel import AutoParallel
>>> ms.set_context(mode=ms.GRAPH_MODE)
>>> init()
>>>
>>> # Case 1: cell uses functional
>>> class BasicBlock(nn.Cell):
>>>     def __init__(self):
>>>         super(BasicBlock, self).__init__()
>>>         self.dense1 = nn.Dense(64, 64)
>>>         self.gelu = nn.GELU()
>>>         def my_add(x, y):
>>>             x = ops.abs(x)
>>>             return x + y
>>>         # shard a function with tuple(int) strategies
>>>         self.shard_my_add = shard(my_add, in_strategy=((2, 2), (1, 4)), out_strategy=((4, 1),))
>>>
>>>     def construct(self, x, u):
>>>         x = self.gelu(x)
>>>         y = self.gelu(u)
>>>         y = x * y
>>>         x = self.dense1(x)
>>>         x = self.shard_my_add(x, y)
>>>         return x
>>>
>>> class NetForward(nn.Cell):
>>>     def __init__(self):
>>>         super(NetForward, self).__init__()
>>>         self.block1 = BasicBlock()
>>>         self.block2 = BasicBlock()
>>>         self.matmul = ops.MatMul()
>>>
>>>     def construct(self, x, y):
>>>         x = self.matmul(x, y)
>>>         x = self.block1(x, x)
>>>         x = self.block2(x, x)
>>>         return x
>>>
>>> class Net(nn.Cell):
>>>     def __init__(self):
>>>         super(Net, self).__init__()
>>>         # setting cell sharding strategy and parameter_plan by tuple(int)
>>>         self.layer_net1 = NetForward()
>>>         self.layer_net1_shard = shard(self.layer_net1, in_strategy=((4, 2), (2, 1)),
...                                       parameter_plan={"self.layer_net1.block1.weight": (4, 1)})
>>>
>>>         # setting cell sharding strategy and parameter_plan by tuple(ms.Layout)
>>>         self.layer_net2 = NetForward()
>>>         layout = Layout((4, 2, 1), ("dp", "mp", "sp"))
>>>         in_layout = (layout("dp", "mp"), layout("mp", "sp"))
>>>         param_layout = layout("dp", "sp")
>>>         self.layer_net2_shard = shard(self.layer_net2, in_strategy=in_layout,
...                                       parameter_plan={"self.layer_net2.block2.weight": param_layout})
>>>         self.flatten = nn.Flatten()
>>>         self.layer1 = nn.Dense(64, 64)
>>>         self.layer2 = nn.Dense(64, 32)
>>>         self.add = ops.Add()
>>>         self.matmul = ops.MatMul()
>>>
>>>     def construct(self, x, y):
>>>         x = self.flatten(x)
>>>         y = self.flatten(y)
>>>         x = self.layer1(x)
>>>         x = self.layer_net1_shard(x, y)
>>>         x = self.layer_net2_shard(x, y)
>>>         x = self.layer2(x)
>>>         x = self.matmul(x, Tensor(np.ones(shape=(32, 32)), dtype=ms.float32))
>>>         return x
>>>
>>> with no_init_parameters():
>>>     net = Net()
>>> x = Tensor(np.ones(shape=(64, 1, 8, 8)), dtype=ms.float32)
>>> y = Tensor(np.ones(shape=(64, 1, 8, 8)), dtype=ms.float32)
>>> parallel_net = AutoParallel(net, parallel_mode='sharding_propagation')
>>> parallel_net(x, y)
>>>
>>> # Case 2: function uses functional sharding
>>> def test_shard(x, y):
...     return x + y
>>> x = Tensor(np.ones(shape=(32, 10)), dtype=ms.float32)
>>> y = Tensor(np.ones(shape=(32, 10)), dtype=ms.float32)
>>> output = shard(test_shard, in_strategy=((4, 2), (4, 2)))(x, y)
>>> print(output.shape)
(32, 10)