mindspore.dataset.config
The configuration module provides various functions to set and get the supported configuration parameters, and read a configuration file.
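As an illustrative sketch (not part of the original examples), a common pattern is to adjust a few global settings once, before constructing any dataset pipeline; the specific values shown here are arbitrary:
>>> import mindspore.dataset as ds
>>>
>>> # Adjust global dataset settings before building any pipeline.
>>> ds.config.set_seed(1234)
>>> ds.config.set_num_parallel_workers(8)
>>>
>>> # Each setter has a matching getter for inspecting the current value.
>>> ds.config.get_num_parallel_workers()
8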
- mindspore.dataset.config.set_seed(seed)
Set the seed to be used in any random generator. This is used to produce deterministic results.
Note
This set_seed function sets the seed of the Python random library and the numpy.random library, so that Python augmentations that use randomness are deterministic. It should be called each time an iterator is created in order to reset the random seed. Within the pipeline, this does not guarantee deterministic results when num_parallel_workers > 1.
- Parameters
seed (int) – Seed to be set.
- Raises
ValueError – If seed is invalid (< 0 or > MAX_UINT_32).
Examples
>>> import mindspore.dataset as ds
>>>
>>> # Set a new global configuration value for the seed value.
>>> # Operations with randomness will use the seed value to generate random values.
>>> ds.config.set_seed(1000)
- mindspore.dataset.config.get_seed()
Get the seed.
- Returns
int, seed.
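A minimal usage sketch (not from the original docs), pairing the getter with set_seed; the value 1000 is arbitrary:
>>> import mindspore.dataset as ds
>>>
>>> # After setting the global seed, get_seed returns the same value.
>>> ds.config.set_seed(1000)
>>> ds.config.get_seed()
1000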
- mindspore.dataset.config.set_prefetch_size(size)
Set the number of rows to be prefetched.
- Parameters
size (int) – Total number of rows to be prefetched.
- Raises
ValueError – If prefetch_size is invalid (<= 0 or > MAX_INT_32).
Examples
>>> import mindspore.dataset as ds
>>>
>>> # Set a new global configuration value for the prefetch size.
>>> ds.config.set_prefetch_size(1000)
- mindspore.dataset.config.get_prefetch_size()
Get the prefetch size in number of rows.
- Returns
int, total number of rows to be prefetched.
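A minimal usage sketch (not from the original docs), showing the getter after the matching setter; 1000 rows is an arbitrary choice:
>>> import mindspore.dataset as ds
>>>
>>> # After setting the prefetch size, read it back.
>>> ds.config.set_prefetch_size(1000)
>>> ds.config.get_prefetch_size()
1000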
- mindspore.dataset.config.set_num_parallel_workers(num)
Set the default number of parallel workers.
- Parameters
num (int) – Number of parallel workers to be used as a default for each operation.
- Raises
ValueError – If num_parallel_workers is invalid (<= 0 or > MAX_INT_32).
Examples
>>> import mindspore.dataset as ds
>>>
>>> # Set a new global configuration value for the number of parallel workers.
>>> # Now parallel dataset operators will run with 8 workers.
>>> ds.config.set_num_parallel_workers(8)
- mindspore.dataset.config.get_num_parallel_workers()
Get the default number of parallel workers. This is the default num_parallel_workers value used for each operation; it is not related to the AutoNumWorker feature.
- Returns
int, number of parallel workers to be used as a default for each operation.
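A minimal usage sketch (not from the original docs); 8 workers is an arbitrary choice:
>>> import mindspore.dataset as ds
>>>
>>> # After setting the default number of parallel workers, read it back.
>>> ds.config.set_num_parallel_workers(8)
>>> ds.config.get_num_parallel_workers()
8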
- mindspore.dataset.config.set_monitor_sampling_interval(interval)
Set the default interval (in milliseconds) for monitor sampling.
- Parameters
interval (int) – Interval (in milliseconds) to be used for performance monitor sampling.
- Raises
ValueError – If interval is invalid (<= 0 or > MAX_INT_32).
Examples
>>> import mindspore.dataset as ds
>>>
>>> # Set a new global configuration value for the monitor sampling interval.
>>> ds.config.set_monitor_sampling_interval(100)
- mindspore.dataset.config.get_monitor_sampling_interval()
Get the default interval of performance monitor sampling.
- Returns
int, interval (in milliseconds) for performance monitor sampling.
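A minimal usage sketch (not from the original docs); 100 ms is an arbitrary choice:
>>> import mindspore.dataset as ds
>>>
>>> # After setting the monitor sampling interval, read it back (in milliseconds).
>>> ds.config.set_monitor_sampling_interval(100)
>>> ds.config.get_monitor_sampling_interval()
100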
- mindspore.dataset.config.load(file)
Load configurations from a file.
- Parameters
file (str) – Path of the configuration file to be loaded.
- Raises
RuntimeError – If file is invalid and parsing fails.
Examples
>>> import mindspore.dataset as ds
>>>
>>> # Set new default configuration values according to values in the configuration file.
>>> ds.config.load("path/to/config/file")
>>> # example config file:
>>> # {
>>> #     "logFilePath": "/tmp",
>>> #     "numParallelWorkers": 4,
>>> #     "seed": 5489,
>>> #     "monitorSamplingInterval": 30
>>> # }
- mindspore.dataset.config.get_callback_timeout()
Get the default timeout for DSWaitedCallback. In case of a deadlock, the wait function will exit after the timeout period.
- Returns
int, the duration in seconds.
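A minimal usage sketch (not from the original docs); the returned value depends on the current configuration, so no output is shown:
>>> import mindspore.dataset as ds
>>>
>>> # Query the current callback timeout, in seconds.
>>> timeout = ds.config.get_callback_timeout()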
- mindspore.dataset.config.set_auto_num_workers(enable)
Set num_parallel_workers for each op automatically (this feature is turned off by default). If turned on, the num_parallel_workers in each op will be adjusted automatically, possibly overwriting the num_parallel_workers passed in by the user or the default value (if the user does not pass anything) set by ds.config.set_num_parallel_workers(). For now, this function is only optimized for Yolo3 datasets with per_batch_map (running map in batch). This feature aims to provide a baseline for optimized num_workers assignment for each op. Any op whose num_parallel_workers is adjusted to a new value will be logged.
- Parameters
enable (bool) – Whether to enable auto num_workers feature or not.
- Raises
TypeError – If enable is not of boolean type.
Examples
>>> import mindspore.dataset as ds
>>>
>>> # Enable the auto_num_worker feature; this might override the num_parallel_workers passed in by the user.
>>> ds.config.set_auto_num_workers(True)
- mindspore.dataset.config.get_auto_num_workers()
Get whether the automatic number of workers feature is turned on or off.
- Returns
bool, whether the auto num workers feature is turned on.
Examples
>>> import mindspore.dataset as ds
>>>
>>> # Check whether the auto num workers feature is currently enabled.
>>> ds.config.get_auto_num_workers()