mindspore.dataset.Dataset.project

Dataset.project(columns)

Selects the specified columns from the dataset and passes them down the pipeline in the order given. All other columns are discarded.

Parameters: columns (Union[str, list[str]]) – Name, or list of names, of the columns to project.
Returns: Dataset, a new dataset with the above operation applied.

Examples

>>> import mindspore.dataset as ds
>>> # Create a dataset with 3 columns
>>> input_columns = ["column1", "column2", "column3"]
>>> dataset = ds.GeneratorDataset([(1, 2, 3), (3, 4, 5), (5, 6, 7)], column_names=input_columns)
>>>
>>> columns_to_project = ["column3", "column1", "column2"]
>>> # Create a dataset that consists of column3, column1, column2
>>> # in that order, regardless of the original order of columns.
>>> dataset = dataset.project(columns=columns_to_project)
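
The selection and reordering described above can be illustrated without MindSpore. The following is a minimal plain-Python sketch of the semantics only, not the library's implementation; `project_rows` is a hypothetical helper introduced here for illustration, and the rows are assumed to be plain tuples whose positions match `column_names`.

```python
from typing import Iterable, Iterator, List, Sequence, Tuple, Union


def project_rows(
    rows: Iterable[Sequence],
    column_names: List[str],
    columns: Union[str, List[str]],
) -> Iterator[Tuple]:
    """Yield each row restricted to `columns`, in the order requested.

    Hypothetical helper: it mimics the documented behaviour of project
    on plain tuples and is not part of the MindSpore API.
    """
    # Accept a single column name as well as a list of names.
    if isinstance(columns, str):
        columns = [columns]
    # Map each requested name to its position in the original row.
    # list.index raises ValueError if a requested name does not exist.
    positions = [column_names.index(name) for name in columns]
    for row in rows:
        # Keep only the requested positions; every other column is dropped.
        yield tuple(row[i] for i in positions)


rows = [(1, 2, 3), (3, 4, 5), (5, 6, 7)]
names = ["column1", "column2", "column3"]

# Keep column3 and column1, in that order; column2 is discarded.
projected = list(project_rows(rows, names, ["column3", "column1"]))
print(projected)  # [(3, 1), (5, 3), (7, 5)]
```

The output order follows the `columns` argument rather than the original column order, which is the property the example above relies on.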