mindspore.data_sink
- mindspore.data_sink(fn, dataset, sink_size=1, jit_config=None, input_signature=None)
A wrapper function that takes the input function and generates a new function that runs it in data sinking mode.
Note
When using data sinking, the dataset cycles automatically, so only the total number of training steps (total_step) and the number of steps in each sink (sink_size) need to be considered. When switching from training by epochs to data sinking, convert as follows:
total_step = epochs * dataset_size
train_sink_step = total_step / sink_size
After wrapping the function with mindspore.data_sink, call the generated function train_sink_step times to complete training.
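As a worked illustration of the conversion above, the following minimal sketch computes the number of calls in plain Python. The helper name sink_schedule and the sample values (10 epochs, 1000 batches, sink_size of 100) are hypothetical and not part of MindSpore. The note does not say how a total_step that is not divisible by sink_size should be handled, so the sketch simply reports the leftover steps instead of assuming a rounding rule.

```python
def sink_schedule(epochs, dataset_size, sink_size):
    """Convert an epoch-based schedule into a data-sinking schedule.

    Hypothetical helper for illustration only; it is not a MindSpore API.
    Returns (total_step, train_sink_step, leftover).
    """
    if not isinstance(sink_size, int) or sink_size <= 0:
        raise ValueError("sink_size must be a positive integer")
    # total_step = epochs * dataset_size
    total_step = epochs * dataset_size
    # train_sink_step = total_step / sink_size; leftover holds any steps
    # that do not fill a complete sink (how to treat them is left open).
    train_sink_step, leftover = divmod(total_step, sink_size)
    return total_step, train_sink_step, leftover


# Sample values are assumptions chosen for the example.
total_step, train_sink_step, leftover = sink_schedule(
    epochs=10, dataset_size=1000, sink_size=100
)
print(total_step, train_sink_step, leftover)  # 10000 100 0
```

With these sample values, the function generated by mindspore.data_sink would be called 100 times, each call consuming 100 steps of data.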
- Parameters
fn (Function) – The Python function that will be run with the dataset.
dataset (Dataset) – The dataset iterator. The dataset can be generated by a dataset generator API in mindspore.dataset, such as mindspore.dataset.ImageFolderDataset.
sink_size (int) – Controls the amount of data in each sink. sink_size must be a positive integer. Default: 1.
jit_config (JitConfig) – Controls the execution mode (Graph mode/PyNative mode) of the generated function and the JIT configuration used for compilation. Default: None, which means running in PyNative mode.
input_signature (Union[Tensor, List or Tuple of Tensors]) – The Tensor which describes the input arguments. The shape and dtype of the Tensor will be supplied to this function. If input_signature is specified, each input to fn must be a Tensor, and fn cannot accept **kwargs. The shape and dtype of the actual inputs must match input_signature; otherwise, TypeError will be raised. Default: None.
- Returns
Function, the generated function will be executed in data sinking mode.
- Raises
ValueError – If sink_size is not a positive integer.
- Supported Platforms:
Ascend
GPU
Examples
>>> import numpy as np
>>> import mindspore as ms
>>> from mindspore import dataset as ds
>>>
>>> data = {"x": np.ones((1,), dtype=np.int32), "y": np.ones((1,), dtype=np.int32)}
>>> dataset = ds.NumpySlicesDataset(data=data)
>>>
>>> def func_net(x, y):
...     out = x + y
...     return out
>>>
>>> sink_process = ms.data_sink(func_net, dataset, sink_size=1)
>>> for _ in range(2):
...     out = sink_process()
...     print(out)
2
2