mindspore.dataset.PKSampler

class mindspore.dataset.PKSampler(num_val, num_class=None, shuffle=False, class_column='label', num_samples=None)

Samples K elements from each of the P classes in the dataset.

Parameters
  • num_val (int) – Number of elements to sample for each class.

  • num_class (int, optional) – Number of classes to sample. Default: None, which samples all classes. Specifying this parameter is not currently supported.

  • shuffle (bool, optional) – Whether to shuffle the class IDs. Default: False.

  • class_column (str, optional) – Name of the column with class labels for MindDataset. Default: 'label'.

  • num_samples (int, optional) – The number of samples to draw. Default: None, which samples all elements.

Raises

Examples

>>> import mindspore.dataset as ds
>>> # creates a PKSampler that will get 3 samples from every class.
>>> sampler = ds.PKSampler(3)
>>> dataset = ds.ImageFolderDataset(image_folder_dataset_dir,
...                                 num_parallel_workers=8,
...                                 sampler=sampler)
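The selection described above can be modelled in plain Python. This is a minimal sketch, not MindSpore's implementation: the helper name pk_sample_indices is hypothetical, and the ordering of the result, the handling of classes with fewer than num_val elements, and the way num_samples caps the output are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def pk_sample_indices(labels, num_val, shuffle=False, num_samples=None, seed=None):
    """Hypothetical model of PK sampling: pick up to num_val indices per class."""
    # Group sample indices by their class label.
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    class_ids = sorted(by_class)
    if shuffle:
        # Assumption: shuffling reorders the class IDs, not the samples in a class.
        random.Random(seed).shuffle(class_ids)

    picked = []
    for class_id in class_ids:
        # Assumption: a class with fewer than num_val elements contributes all of them.
        picked.extend(by_class[class_id][:num_val])

    # Assumption: num_samples caps the total number of indices returned.
    return picked if num_samples is None else picked[:num_samples]


labels = [0, 0, 0, 0, 1, 1, 2, 2, 2]
print(pk_sample_indices(labels, num_val=3))  # prints [0, 1, 2, 4, 5, 6, 7, 8]
```

With num_val=3, class 0 contributes three of its four elements, class 1 contributes both of its elements, and class 2 contributes all three.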

add_child(sampler)

Add a sub-sampler to the given sampler. The parent sampler receives all data output by the sub-sampler and applies its own sampling logic to that output to return new samples.

Parameters

sampler (Sampler) – Object used to choose samples from the dataset. Only built-in samplers (mindspore.dataset.DistributedSampler, mindspore.dataset.PKSampler, mindspore.dataset.RandomSampler, mindspore.dataset.SequentialSampler, mindspore.dataset.SubsetRandomSampler, mindspore.dataset.WeightedRandomSampler) are supported.

Examples

>>> import mindspore.dataset as ds
>>> sampler = ds.SequentialSampler(start_index=0, num_samples=3)
>>> sampler.add_child(ds.RandomSampler(num_samples=4))
>>> dataset = ds.Cifar10Dataset(cifar10_dataset_dir, sampler=sampler)
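The parent/child relationship in this example can be pictured with a small pure-Python model. It is a conceptual sketch only: random_child and sequential_parent are hypothetical stand-ins rather than MindSpore functions, and the dataset size of 10 is an arbitrary assumption.

```python
import random


def random_child(dataset_size, num_samples, seed=None):
    """Hypothetical stand-in for a random child sampler: draw distinct indices."""
    rng = random.Random(seed)
    return rng.sample(range(dataset_size), num_samples)


def sequential_parent(child_indices, start_index=0, num_samples=None):
    """Hypothetical stand-in for a sequential parent: walk the child's output in order."""
    selected = child_indices[start_index:]
    return selected if num_samples is None else selected[:num_samples]


# The child draws 4 random indices from an assumed dataset of 10 elements.
child_indices = random_child(dataset_size=10, num_samples=4, seed=0)
# The parent sees only the child's output and keeps its first 3 entries.
final_indices = sequential_parent(child_indices, start_index=0, num_samples=3)
print(child_indices, final_indices)
```

In this model the parent never sees the full dataset: it selects only from the indices that the child has already chosen.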

get_child()

Get the child sampler of the given sampler.

Returns

Sampler, the child sampler of the given sampler.

Examples

>>> import mindspore.dataset as ds
>>> sampler = ds.SequentialSampler(start_index=0, num_samples=3)
>>> sampler.add_child(ds.RandomSampler(num_samples=2))
>>> child_sampler = sampler.get_child()

get_num_samples()

Get the num_samples value of the current sampler instance. This parameter can optionally be passed in when the sampler is defined. Default: None. The method returns the num_samples value. If the current sampler has a child sampler, the method also queries the child sampler and combines the two values according to the rules below.

The following table shows the various possible combinations, and the final results returned.

child sampler    num_samples    child_samples    result
T                x              y                min(x, y)
T                x              None             x
T                None           y                y
T                None           None             None
None             x              n/a              x
None             None           n/a              None

Returns

int, the number of samples, or None.

Examples

>>> import mindspore.dataset as ds
>>> sampler = ds.SequentialSampler(start_index=0, num_samples=3)
>>> num_samples = sampler.get_num_samples()
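The combination rules listed for get_num_samples() can be written as a short pure-Python function. This is a sketch of the documented rules only, not MindSpore's code: resolve_num_samples and its has_child flag are hypothetical names introduced for illustration.

```python
def resolve_num_samples(has_child, num_samples, child_samples=None):
    """Hypothetical helper that mirrors the documented combination rules."""
    if not has_child:
        # No child sampler: the sampler's own value is returned unchanged.
        return num_samples
    if num_samples is not None and child_samples is not None:
        # Both values are set: the smaller one wins.
        return min(num_samples, child_samples)
    # At most one value is set: return it, or None when neither is set.
    return num_samples if num_samples is not None else child_samples


print(resolve_num_samples(True, 3, 4))     # prints 3
print(resolve_num_samples(True, None, 4))  # prints 4
print(resolve_num_samples(False, 3))       # prints 3
```

An explicit `is not None` check is used instead of truthiness so that a value of 0 would not be mistaken for an unset parameter.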