mindspore.dataset.PKSampler
- class mindspore.dataset.PKSampler(num_val, num_class=None, shuffle=False, class_column="label", num_samples=None)
Samples K elements from each of the P classes in the dataset.
- Parameters
num_val (int) – Number of elements to sample for each class.
num_class (int, optional) – Number of classes to sample (default=None, sample all classes). Specifying this parameter is not currently supported.
shuffle (bool, optional) – Whether to shuffle the class IDs (default=False).
class_column (str, optional) – Name of the column that holds the class labels, used for MindDataset (default='label').
num_samples (int, optional) – The number of samples to draw (default=None, which means sample all elements).
Examples
>>> import mindspore.dataset as ds
>>> # Create a PKSampler that draws 3 samples from every class.
>>> # image_folder_dataset_dir is a placeholder for the dataset path.
>>> sampler = ds.PKSampler(3)
>>> dataset = ds.ImageFolderDataset(image_folder_dataset_dir,
...                                 num_parallel_workers=8,
...                                 sampler=sampler)
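The idea of drawing K elements per class can be illustrated without MindSpore. The following is a minimal pure-Python sketch, not the library's implementation: the function name `pk_sample_indices` and the `seed` argument are hypothetical, and taking the first `num_val` indices of each class in dataset order is an assumption made for illustration.

```python
import random
from collections import defaultdict


def pk_sample_indices(labels, num_val, shuffle=False, num_samples=None, seed=None):
    """Hypothetical sketch: pick up to num_val indices for every class label."""
    # Group sample indices by class label, preserving dataset order.
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    # Visit classes in sorted order, or in shuffled order when requested.
    class_ids = sorted(by_class)
    if shuffle:
        random.Random(seed).shuffle(class_ids)

    # Assumption: take the first num_val indices of each class.
    picked = []
    for class_id in class_ids:
        picked.extend(by_class[class_id][:num_val])

    # Optionally cap the total number of samples drawn.
    if num_samples is not None:
        picked = picked[:num_samples]
    return picked


labels = [0, 0, 0, 1, 1, 2, 2, 2, 2]
print(pk_sample_indices(labels, num_val=2))  # [0, 1, 3, 4, 5, 6]
```

With `shuffle=True` only the order in which classes are visited changes in this sketch; the set of selected indices stays the same.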
- Raises
TypeError – If shuffle is not a boolean value.
TypeError – If class_column is not a str value.
TypeError – If num_samples is not an integer value.
NotImplementedError – If num_class is not None.
RuntimeError – If num_val is not a positive value.
ValueError – If num_samples is a negative value.
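The error conditions listed above can be mirrored in a small stand-alone checker. This is only a sketch of the documented behavior: `check_pk_sampler_args` is a hypothetical helper, the order of the checks is an assumption, and rejecting `bool` for `num_samples` is a choice of this sketch (in Python, `bool` is a subclass of `int`).

```python
def check_pk_sampler_args(num_val, num_class=None, shuffle=False,
                          class_column="label", num_samples=None):
    """Hypothetical helper that mirrors the documented Raises list."""
    if not isinstance(shuffle, bool):
        raise TypeError("shuffle must be a boolean value")
    if not isinstance(class_column, str):
        raise TypeError("class_column must be a str value")
    if num_samples is not None:
        # Sketch-level choice: treat bool as invalid even though it subclasses int.
        if isinstance(num_samples, bool) or not isinstance(num_samples, int):
            raise TypeError("num_samples must be an integer value")
        if num_samples < 0:
            raise ValueError("num_samples must not be negative")
    if num_class is not None:
        raise NotImplementedError("specifying num_class is not supported")
    if num_val <= 0:
        raise RuntimeError("num_val must be a positive value")


check_pk_sampler_args(3)  # valid arguments pass silently
```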
- add_child(sampler)
Add a sub-sampler (child) to this sampler. The sub-sampler receives all data output by the parent sampler and applies its own sampling logic to return new samples.
- Parameters
sampler (Sampler) – Object used to choose samples from the dataset. Only built-in samplers (DistributedSampler, PKSampler, RandomSampler, SequentialSampler, SubsetRandomSampler, WeightedRandomSampler) are supported.
Examples
>>> import mindspore.dataset as ds
>>> sampler = ds.SequentialSampler(start_index=0, num_samples=3)
>>> sampler.add_child(ds.RandomSampler(num_samples=2))
>>> dataset = ds.Cifar10Dataset(cifar10_dataset_dir, sampler=sampler)
- get_child()
Get the child sampler of this sampler.
- Returns
Sampler, the child sampler of this sampler.
Examples
>>> import mindspore.dataset as ds
>>> sampler = ds.SequentialSampler(start_index=0, num_samples=3)
>>> sampler.add_child(ds.RandomSampler(num_samples=2))
>>> child_sampler = sampler.get_child()
- get_num_samples()
Every sampler has a num_samples value, which is either a number or None. A sampler may also have a child sampler, whose own sample count is likewise a number or None. Together, these conditions determine the sample count that is actually used. The following table shows the possible results from calling this function.
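| child sampler | num_samples | child_samples | result    |
|---------------|-------------|---------------|-----------|
| T             | x           | y             | min(x, y) |
| T             | x           | None          | x         |
| T             | None        | y             | y         |
| T             | None        | None          | None      |
| None          | x           | n/a           | x         |
| None          | None        | n/a           | None      |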
- Returns
int, the number of samples, or None.
Examples
>>> import mindspore.dataset as ds
>>> sampler = ds.SequentialSampler(start_index=0, num_samples=3)
>>> num_samples = sampler.get_num_samples()
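The resolution rules in the table for get_num_samples() can be written out as a small pure-Python function. This is a sketch of the documented table only, not MindSpore's code; the function name `resolve_num_samples` and the `has_child` flag are hypothetical names introduced for illustration.

```python
def resolve_num_samples(num_samples, has_child=False, child_samples=None):
    """Hypothetical sketch of the sample-count table for get_num_samples()."""
    # No child sampler: the sampler's own num_samples (or None) is the result.
    if not has_child:
        return num_samples
    # Child exists but this sampler has no limit: defer to the child's count.
    if num_samples is None:
        return child_samples
    # Child exists without a limit: this sampler's count applies.
    if child_samples is None:
        return num_samples
    # Both counts are numeric: the smaller one wins.
    return min(num_samples, child_samples)


print(resolve_num_samples(3, has_child=True, child_samples=2))  # 2
print(resolve_num_samples(3))                                   # 3
```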