mindspore.dataset.Schema

class mindspore.dataset.Schema(schema_file=None)[source]

Class to represent a schema of a dataset.

Parameters: schema_file (str) – Path of schema file (default=None).
Returns: Schema object, schema information about the dataset.
Raises: RuntimeError – If the schema file fails to load.

Example

>>> import mindspore.dataset as ds
>>> import mindspore.common.dtype as mstype
>>>
>>> # Create schema; specify column name, mindspore.dtype and shape of the column
>>> schema = ds.Schema()
>>> schema.add_column('col1', de_type=mstype.int64, shape=[2])

add_column(name, de_type, shape=None)[source]

Add a new column to the schema.

Parameters

name (str) – Name of the column.
de_type (str) – Data type of the column.
shape (list[int], optional) – Shape of the column (default=None, which is treated as [-1], an unknown shape of rank 1).

Raises

ValueError – If the column type is unknown.

from_json(json_obj)[source]

Load the schema from a parsed JSON object.

Parameters

json_obj (dict) – Parsed JSON object.

Raises

RuntimeError – If there is an unknown item in the object.
RuntimeError – If the dataset type is missing in the object.
RuntimeError – If the columns are missing in the object.

parse_columns(columns)[source]

Parse the columns and add them to the schema.

Parameters

columns (Union[dict, list[dict], tuple[dict]]) –

Dataset attribute information, decoded from schema file.

list[dict]: each dict must contain the keys ‘name’ and ‘type’; ‘shape’ is optional.
dict: each key is a column name and each value is a dict that must contain ‘type’; ‘shape’ is optional.

Raises

RuntimeError – If the columns fail to parse.
RuntimeError – If a column’s name field is missing.
RuntimeError – If a column’s type field is missing.

Example

>>> import mindspore.dataset as ds
>>> schema = ds.Schema()
>>> columns1 = [{'name': 'image', 'type': 'int8', 'shape': [3, 3]},
...             {'name': 'label', 'type': 'int8', 'shape': [1]}]
>>> schema.parse_columns(columns1)
>>> columns2 = {'image': {'shape': [3, 3], 'type': 'int8'}, 'label': {'shape': [1], 'type': 'int8'}}
>>> schema.parse_columns(columns2)
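
The two layouts shown above carry the same information. As a rough illustration, the following pure-Python sketch converts the dict layout into the list layout. `dict_columns_to_list` is a hypothetical helper written for this page, not part of `mindspore.dataset`, and it does not reproduce the library's own validation or error types.

```python
def dict_columns_to_list(columns):
    """Convert {'name': {'type': ..., 'shape': ...}} into a list of column dicts.

    Hypothetical helper for illustration; not part of mindspore.dataset.
    """
    result = []
    for name, spec in columns.items():
        if 'type' not in spec:
            # Mirrors the documented rule that 'type' is required.
            raise ValueError("column %r has no 'type' field" % name)
        entry = {'name': name, 'type': spec['type']}
        # 'shape' is optional, so copy it only when present.
        if 'shape' in spec:
            entry['shape'] = spec['shape']
        result.append(entry)
    return result


columns2 = {'image': {'shape': [3, 3], 'type': 'int8'},
            'label': {'shape': [1], 'type': 'int8'}}
columns1 = dict_columns_to_list(columns2)
print(columns1)
```

Either layout can be passed to parse_columns directly; the conversion is only meant to show how the fields line up.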

to_json()[source]

Get a JSON string of the schema.

Returns: str, JSON string of the schema.