mindspore.ops.tensordump

mindspore.ops.tensordump(file_name, tensor, mode='out')

Save tensor in npy format.

Warning

The parameter mode will no longer support the value 'all'.

In parallel scenarios, tensordump dumps the slice of data held by each rank. On the Ascend platform in graph mode, code written as OpA --> OpB may be compiled as OpA --> RedistributionOps --> OpB.

Note: The redistribution operators are introduced because of inter-device communication and shard strategies in the static graph parallel scenario.

In the case of OpA --> OpB, the dumped output of OpA is equal to the input of OpB.

In the case of OpA --> RedistributionOps --> OpB, however, the dumped output of OpA is not equal to the input of OpB, because of the redistribution operators. The parameter mode handles this situation.

Assume that OpA's output is used both as the tensor argument of tensordump and as the input of OpB. The parameter mode then selects which data is saved:

If the mode is 'out', the dump data contains only OpA's output slice.
If the mode is 'in', the dump data contains only OpB's input slice.

For mode 'in', the npy file holding the input slice is named: fileName_dumpMode_dtype_id.npy.

For mode 'out', the npy file holding the output slice is named: fileName_dtype_id.npy.

fileName: Value of the parameter file_name (if parameter file_name is a user-specified path, the value of fileName is the last level of the path).
dumpMode: Value of the parameter mode.
dtype: The original data type. Data of type bfloat16 stored in the .npy file will be converted to float32.
id: An auto increment ID.
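
The naming rules above can be sketched as a small helper that predicts which file to look for. This is only an illustration: expected_dump_name is a hypothetical function, not part of MindSpore, and dropping a trailing ".npy" from file_name is an assumption inferred from the example on this page, where 'rank_0_mul1_mul2.npy' yields 'rank_0_mul1_mul2_float32_0.npy'.

```python
import os


def expected_dump_name(file_name, mode, dtype, dump_id):
    """Return the npy file name expected for one dump (hypothetical helper)."""
    # fileName is the last level of the user-specified path.
    base = os.path.basename(file_name)
    # Assumption: a trailing ".npy" is dropped before the suffixes are added.
    if base.endswith(".npy"):
        base = base[: -len(".npy")]
    if mode == "in":
        # Mode 'in' inserts the mode between fileName and dtype.
        return f"{base}_in_{dtype}_{dump_id}.npy"
    if mode == "out":
        # Mode 'out' omits the mode from the name.
        return f"{base}_{dtype}_{dump_id}.npy"
    raise ValueError("mode must be 'in' or 'out'")


print(expected_dump_name("rank_0_mul1_mul2.npy", "out", "float32", 0))
print(expected_dump_name("dumps/act", "in", "float16", 3))
```

Because id is auto-incremented, a pattern such as glob.glob('act_in_float16_*.npy') may be more practical than predicting a single id when a dump point is executed many times.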

Note

On the Ascend platform in graph mode, the environment variables MS_DUMP_SLICE_SIZE and MS_DUMP_WAIT_TIME can be set to avoid operator execution failures when dumping large tensors or dumping tensors at a high rate.
The tensordump operator is not supported in control flow.
If the current parallel mode is STAND_ALONE, mode must be 'out'.
This function is used for debugging.
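
As a rough sketch, the two environment variables mentioned above could be set from Python before the network runs. The values below are placeholders chosen for illustration, not recommendations; the valid ranges and units should be taken from the MindSpore environment variable documentation. Setting them before the network is compiled is also an assumption here.

```python
import os

# Placeholder values for illustration only; check the MindSpore environment
# variable documentation for the permitted ranges and units.
# setdefault keeps any value already exported in the launching shell.
os.environ.setdefault("MS_DUMP_SLICE_SIZE", "1024")
os.environ.setdefault("MS_DUMP_WAIT_TIME", "10")

print(os.environ["MS_DUMP_SLICE_SIZE"], os.environ["MS_DUMP_WAIT_TIME"])
```

Exporting the variables in the shell that launches the job has the same effect and applies to every rank.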

Parameters

file_name (str) – The path where the npy file is saved.
tensor (Tensor) – The tensor to dump.
mode (str, optional) – Controls the behavior of tensordump. Must be one of ['in', 'out']. Default: 'out'.

Supported Platforms: Ascend

Examples

Note

Run the example below with the msrun command: msrun --worker_num=2 --local_worker_num=2 --master_port=11450 --log_dir=msrun_log --join=True --cluster_time_out=300 tensordump_example.py

>>> import os
>>> import time
>>> import numpy as np
>>> import mindspore
>>> from mindspore import nn
>>> from mindspore.communication import init, get_rank
>>> from mindspore.parallel.auto_parallel import AutoParallel
>>> from mindspore.nn.utils import no_init_parameters
>>> init()
>>> rank_id = get_rank()
>>> dump_path = f'rank_{rank_id}_mul1_mul2.npy'
>>> class Net(nn.Cell):
...     def __init__(self, strategy1, strategy2):
...         super(Net, self).__init__()
...         self.matmul1 = mindspore.ops.MatMul().shard(strategy1)
...         self.matmul2 = mindspore.ops.MatMul().shard(strategy2)
...
...     def construct(self, x, y, b):
...         out1 = self.matmul1(x, y)
...         mindspore.ops.tensordump(dump_path, out1, 'out')
...         out2 = self.matmul2(out1, b)
...         return out2
...
>>> mindspore.set_context(mode=mindspore.GRAPH_MODE)
>>> os.environ["MS_DEV_SAVE_GRAPHS"] = "2"
>>> strategy1 = ((1, 2), (2, 1))
>>> strategy2 = ((1, 2), (2, 1))
>>> with no_init_parameters():
...     net = Net(strategy1, strategy2)
>>> x = mindspore.tensor(0.1 * mindspore.ops.randn(64, 64), mindspore.float32)
>>> y = mindspore.tensor(0.1 * mindspore.ops.randn(64, 64), mindspore.float32)
>>> b = mindspore.tensor(0.1 * mindspore.ops.randn(64, 64), mindspore.float32)
>>> parallel_net = AutoParallel(net, parallel_mode="semi_auto")
>>> parallel_net.dataset_strategy(config="full_batch")
>>> out = parallel_net(x, y, b)
>>> print(f"out shape is: {out.shape}")
out shape is: (64, 64)
>>> time.sleep(0.5)  # The npy file is written asynchronously; wait briefly before loading it.
>>> matmul1_output_slice = np.load(f'rank_{rank_id}_mul1_mul2_float32_0.npy')      # load matmul1's output slice
>>> print(f"matmul1_output_slice is loaded, shape is: {matmul1_output_slice.shape}")
matmul1_output_slice is loaded, shape is: (64, 64)
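
Because the npy file is written asynchronously, a fixed sleep may be too short for large tensors. One possible alternative is to poll until the file appears and its size stops changing. The helper below is a sketch using only the standard library: wait_for_file is a hypothetical name, not a MindSpore API, and a stable file size is only a heuristic for "writing has finished".

```python
import os
import time


def wait_for_file(path, timeout=10.0, interval=0.1):
    """Block until `path` exists and its size stops changing, then return it.

    Hypothetical helper, not part of MindSpore.
    """
    deadline = time.monotonic() + timeout
    last_size = -1
    while time.monotonic() < deadline:
        if os.path.exists(path):
            size = os.path.getsize(path)
            # Two consecutive polls with the same non-zero size suggest that
            # the writer has finished.
            if size > 0 and size == last_size:
                return path
            last_size = size
        time.sleep(interval)
    raise TimeoutError(f"{path} was not written within {timeout} s")
```

In the example above, the fixed sleep could then be replaced by np.load(wait_for_file(f'rank_{rank_id}_mul1_mul2_float32_0.npy')).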