mindspore_gl.dataset.PPI

class mindspore_gl.dataset.PPI(root)

PPI dataset, a source dataset for reading and parsing the PPI dataset.

About PPI dataset:

The dataset contains protein roles in various protein-protein interaction (PPI) graphs, with each graph corresponding to a different human tissue. Positional gene sets, motif gene sets and immunological signatures are used as features, and gene ontology sets are used as labels (121 in total), collected from the Molecular Signatures Database. The average graph contains 2373 nodes, with an average degree of 28.8.

Statistics:

  • Graphs: 24

  • Nodes: ~2245.3

  • Edges: ~61,318.4

  • Number of Classes: 121

  • Label split:

    • Train examples: 20

    • Valid examples: 2

    • Test examples: 2

The dataset can be downloaded here: PPI.

You can organize the dataset files into the following directory structure and read them.

.
└── ppi
    ├── valid_feats.npy
    ├── valid_labels.npy
    ├── valid_graph_id.npy
    ├── valid_graph.json
    ├── train_feats.npy
    ├── train_labels.npy
    ├── train_graph_id.npy
    ├── train_graph.json
    ├── test_feats.npy
    ├── test_labels.npy
    ├── test_graph_id.npy
    └── test_graph.json
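Before constructing the dataset, it can help to confirm that the directory matches the layout above. The following is a minimal sketch using only the Python standard library. `missing_ppi_files` is a hypothetical helper, not part of mindspore_gl, and the file names are taken directly from the tree shown above.

```python
from pathlib import Path

# File names taken from the directory tree above.
SPLITS = ("train", "valid", "test")
SUFFIXES = ("feats.npy", "labels.npy", "graph_id.npy", "graph.json")


def missing_ppi_files(root):
    """Return the sorted names of expected PPI files that are absent from `root`.

    Hypothetical helper for illustration; not part of mindspore_gl.
    """
    root = Path(root)
    expected = [f"{split}_{suffix}" for split in SPLITS for suffix in SUFFIXES]
    return sorted(name for name in expected if not (root / name).is_file())
```

Calling `missing_ppi_files("path/to/ppi")` returns an empty list when all twelve files are present, and the names of the absent files otherwise.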
Parameters

root (str) – path to the root directory that contains ppi_with_mask.npz.

Raises

Examples

>>> from mindspore_gl.dataset.ppi import PPI
>>> root = "path/to/ppi"
>>> dataset = PPI(root)
property graph_count

Total number of graphs.

Returns

  • int, number of graphs.

Examples

>>> # dataset is an instance of PPI
>>> graph_count = dataset.graph_count
property graph_edges

Cumulative edge counts of the graphs.

Returns

  • numpy.ndarray, array of cumulative edge counts.

Examples

>>> # dataset is an instance of PPI
>>> graph_edges = dataset.graph_edges
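Because this property holds cumulative counts, per-graph edge counts are the differences between consecutive entries. The following is a minimal sketch on made-up numbers. It assumes the array begins with 0 and has one more entry than there are graphs, which should be checked against the actual return value. `per_graph_counts` is a hypothetical helper, not part of mindspore_gl.

```python
def per_graph_counts(cumulative):
    """Convert cumulative counts into per-graph counts.

    Assumes `cumulative` starts at 0 (an assumption, not confirmed by the
    documentation). Works on lists and on one-dimensional NumPy arrays.
    """
    values = [int(v) for v in cumulative]
    return [b - a for a, b in zip(values, values[1:])]


# Made-up cumulative edge counts for three small graphs.
example_cumulative = [0, 4, 10, 13]
print(per_graph_counts(example_cumulative))  # [4, 6, 3]
```

The same helper applies unchanged to a cumulative node-count array.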
graph_node_feat(graph_idx)

Node features of a graph.

Parameters

graph_idx (int) – Index of the graph.

Returns

  • numpy.ndarray, node features of the graph.

Examples

>>> # dataset is an instance of PPI
>>> graph_idx = 0
>>> graph_node_feat = dataset.graph_node_feat(graph_idx)
graph_node_label(graph_idx)

Node labels of a graph.

Parameters

graph_idx (int) – Index of the graph.

Returns

  • numpy.ndarray, node labels of the graph.

Examples

>>> # dataset is an instance of PPI
>>> graph_idx = 0
>>> graph_node_label = dataset.graph_node_label(graph_idx)
property graph_nodes

Cumulative node counts of the graphs.

Returns

  • numpy.ndarray, array of cumulative node counts.

Examples

>>> # dataset is an instance of PPI
>>> graph_nodes = dataset.graph_nodes
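A cumulative node-count array can be read as row offsets into the concatenated node arrays. The sketch below shows that mapping on made-up data. It assumes the array starts at 0, so that entries `i` and `i + 1` bound the rows of graph `i`. This is an assumption to verify on the real data. `graph_node_range` is a hypothetical helper, and the `graph_node_feat` method already returns per-graph features directly.

```python
def graph_node_range(cumulative_nodes, graph_idx):
    """Return the (start, stop) row indices of graph `graph_idx`.

    Assumes cumulative_nodes[i] is the number of nodes in all graphs
    before graph i, i.e. the array starts at 0 (not confirmed by the docs).
    """
    return int(cumulative_nodes[graph_idx]), int(cumulative_nodes[graph_idx + 1])


# Made-up data: three graphs with 2, 3 and 1 nodes.
cumulative_nodes = [0, 2, 5, 6]
all_node_ids = ["a", "b", "c", "d", "e", "f"]  # stand-in for rows of node_feat

start, stop = graph_node_range(cumulative_nodes, 1)
print(all_node_ids[start:stop])  # ['c', 'd', 'e']
```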
property node_feat

Node features.

Returns

  • numpy.ndarray, array of node features.

Examples

>>> # dataset is an instance of PPI
>>> node_feat = dataset.node_feat
property node_feat_size

Feature size of each node.

Returns

  • int, the feature size of each node.

Examples

>>> # dataset is an instance of PPI
>>> node_feat_size = dataset.node_feat_size
property node_label

Ground truth labels of each node.

Returns

  • numpy.ndarray, array of node labels.

Examples

>>> # dataset is an instance of PPI
>>> node_label = dataset.node_label
property num_classes

Number of label classes.

Returns

  • int, the number of classes.

Examples

>>> # dataset is an instance of PPI
>>> num_classes = dataset.num_classes
property test_graphs

IDs of the test graphs.

Returns

  • numpy.ndarray, array of test graph IDs.

Examples

>>> # dataset is an instance of PPI
>>> test_graphs = dataset.test_graphs
property test_mask

Mask of test nodes.

Returns

  • numpy.ndarray, array of mask.

Examples

>>> # dataset is an instance of PPI
>>> test_mask = dataset.test_mask
property train_graphs

IDs of the training graphs.

Returns

  • numpy.ndarray, array of training graph IDs.

Examples

>>> # dataset is an instance of PPI
>>> train_graphs = dataset.train_graphs
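The graph ID arrays can be combined with `graph_node_feat` and `graph_node_label` to walk over one split. The following is a minimal sketch of that pattern. It assumes each ID in the array is a valid `graph_idx` argument, which the documentation does not state explicitly. `iter_graphs` and `_StubDataset` are hypothetical names. The stub exists only so the sketch runs without the real data.

```python
def iter_graphs(dataset, graph_ids):
    """Yield (graph_id, features, labels) for each graph ID in `graph_ids`."""
    for graph_id in graph_ids:
        gid = int(graph_id)
        yield gid, dataset.graph_node_feat(gid), dataset.graph_node_label(gid)


class _StubDataset:
    """Tiny stand-in exposing the same member names as PPI (hypothetical)."""

    train_graphs = [0, 2]

    def graph_node_feat(self, graph_idx):
        return [[float(graph_idx)]]  # one node, one made-up feature

    def graph_node_label(self, graph_idx):
        return [[graph_idx % 2]]  # one node, one made-up label


stub = _StubDataset()
for gid, feat, label in iter_graphs(stub, stub.train_graphs):
    print(gid, feat, label)
```

With a real PPI instance, the same loop would read `iter_graphs(dataset, dataset.train_graphs)`.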
property train_mask

Mask of training nodes.

Returns

  • numpy.ndarray, array of mask.

Examples

>>> # dataset is an instance of PPI
>>> train_mask = dataset.train_mask
property val_graphs

IDs of the validation graphs.

Returns

  • numpy.ndarray, array of validation graph IDs.

Examples

>>> # dataset is an instance of PPI
>>> val_graphs = dataset.val_graphs
property val_mask

Mask of validation nodes.

Returns

  • numpy.ndarray, array of mask.

Examples

>>> # dataset is an instance of PPI
>>> val_mask = dataset.val_mask
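A node mask is typically used to pick out the rows of a node-level array that belong to one split. The sketch below illustrates this on made-up data with plain lists. It assumes the mask has one entry per node and that truthy entries mark selected nodes. `select_by_mask` is a hypothetical helper. If the mask is a boolean NumPy array, standard boolean indexing such as `dataset.node_label[dataset.val_mask]` expresses the same selection.

```python
def select_by_mask(rows, mask):
    """Keep the rows whose corresponding mask entry is truthy."""
    if len(rows) != len(mask):
        raise ValueError("rows and mask must have the same length")
    return [row for row, keep in zip(rows, mask) if keep]


# Made-up multi-label targets for four nodes and a mask selecting the last two.
example_labels = [[1, 0], [0, 1], [1, 1], [0, 0]]
example_mask = [0, 0, 1, 1]
print(select_by_mask(example_labels, example_mask))  # [[1, 1], [0, 0]]
```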