mindearth.data.Dataset
- class mindearth.data.Dataset(dataset_generator, distribute=False, num_workers=1, shuffle=True)
Create the dataset for training, validation and testing, and output an instance of class mindspore.dataset.GeneratorDataset.
- Parameters
dataset_generator (Data) – the data generator of weather dataset.
distribute (bool, optional) – whether or not to perform parallel training. Default: False.
num_workers (int, optional) – number of workers (threads) used to process the dataset in parallel. Default: 1.
shuffle (bool, optional) – whether or not to shuffle the dataset. Random-accessible input is required. Default: True. The expected order behavior is shown in the table.
- Supported Platforms:
Ascend
GPU
Examples
>>> from mindearth.data import Era5Data, Dataset
>>> data_params = {
...     'name': 'era5',
...     'root_dir': './dataset',
...     'feature_dims': 69,
...     't_in': 1,
...     't_out_train': 1,
...     't_out_valid': 20,
...     't_out_test': 20,
...     'valid_interval': 1,
...     'test_interval': 1,
...     'train_interval': 1,
...     'pred_lead_time': 6,
...     'data_frequency': 6,
...     'train_period': [2015, 2015],
...     'valid_period': [2016, 2016],
...     'test_period': [2017, 2017],
...     'patch': True,
...     'patch_size': 8,
...     'batch_size': 8,
...     'num_workers': 1,
...     'grid_resolution': 1.4,
...     'h_size': 128,
...     'w_size': 256
... }
>>> dataset_generator = Era5Data(data_params)
>>> dataset = Dataset(dataset_generator)
>>> train_dataset = dataset.create_dataset(1)
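The shuffle parameter requires random-accessible input. As a rough illustration of what that usually means in Python, the sketch below defines a source that supports indexing through `__getitem__` and `__len__`, then visits its samples in a seeded shuffled order. The names `ToyWeatherSource` and `shuffled_indices` are hypothetical and are not part of mindearth or mindspore; this shows only the general idea, not how the library shuffles internally.

```python
import random


class ToyWeatherSource:
    """Hypothetical stand-in for a data generator; not part of mindearth."""

    def __init__(self, num_samples):
        # Each sample is an (input, label) pair of plain integers for illustration.
        self._samples = [(i, i + 1) for i in range(num_samples)]

    def __getitem__(self, index):
        # Random access: any sample can be fetched directly by its index.
        return self._samples[index]

    def __len__(self):
        # The length tells a sampler which indices are valid.
        return len(self._samples)


def shuffled_indices(source, seed=0):
    """Return a seeded permutation of all valid indices for `source` (assumed helper)."""
    indices = list(range(len(source)))
    random.Random(seed).shuffle(indices)
    return indices


source = ToyWeatherSource(6)
order = shuffled_indices(source, seed=0)
# Fetch samples in shuffled order; this only works because the source is indexable.
samples = [source[i] for i in order]
```

A source that only supports iteration (for example, a plain Python generator) cannot be visited in an arbitrary order like this, which is why shuffling needs an indexable source.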