mindspore_gl.dataset.PPI
- class mindspore_gl.dataset.PPI(root)[source]
PPI dataset, a source dataset for reading and parsing the PPI dataset.
About PPI dataset:
The dataset describes protein roles in various protein-protein interaction (PPI) graphs, with each graph corresponding to a different human tissue. Positional gene sets, motif gene sets, and immunological signatures are used as features, and gene ontology sets as labels (121 in total), all collected from the Molecular Signatures Database. The average graph contains 2373 nodes, with an average degree of 28.8.
Statistics:
Graphs: 24
Nodes: ~2245.3
Edges: ~61,318.4
Number of Classes: 121
Label split:
Train examples: 20
Valid examples: 2
Test examples: 2
The dataset can be downloaded here: PPI.
Organize the dataset files into the following directory structure before reading them.
.
└── ppi
    ├── valid_feats.npy
    ├── valid_labels.npy
    ├── valid_graph_id.npy
    ├── valid_graph.json
    ├── train_feats.npy
    ├── train_labels.npy
    ├── train_graph_id.npy
    ├── train_graph.json
    ├── test_feats.npy
    ├── test_labels.npy
    ├── test_graph_id.npy
    └── test_graph.json
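Before constructing the dataset, it can help to confirm that the raw files are in place. The following is a minimal sketch, not part of mindspore_gl: the helper name `missing_ppi_files` is hypothetical, and the file list is taken from the directory structure above.

```python
import os

# File names taken from the directory structure above.
SPLITS = ("train", "valid", "test")
SUFFIXES = ("feats.npy", "labels.npy", "graph_id.npy", "graph.json")


def missing_ppi_files(root):
    """Return the expected PPI file names that are absent from `root`.

    Hypothetical helper for illustration; not a mindspore_gl API.
    """
    expected = [f"{split}_{suffix}" for split in SPLITS for suffix in SUFFIXES]
    return [
        name for name in expected
        if not os.path.isfile(os.path.join(root, name))
    ]
```

An empty list means all twelve files listed above were found under `root`.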
- Parameters
root (str) – path to the root directory that contains ppi_with_mask.npz.
- Raises
TypeError – if root is not a str.
RuntimeError – if root does not contain data files.
Examples
>>> from mindspore_gl.dataset.ppi import PPI
>>> root = "path/to/ppi"
>>> dataset = PPI(root)
- property graph_count
Total number of graphs.
- Returns
int, the number of graphs.
Examples
>>> # dataset is an instance object of Dataset
>>> graph_count = dataset.graph_count
- property graph_edges
Cumulative edge counts of the graphs.
- Returns
numpy.ndarray, array of cumulative edge counts.
Examples
>>> # dataset is an instance object of Dataset
>>> graph_edges = dataset.graph_edges
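To illustrate what a cumulative edge count array encodes, the sketch below recovers per-graph edge counts from it. It uses a small synthetic array rather than real PPI data, and it assumes the array starts at 0 and has one more entry than there are graphs. That layout is an assumption, not something this page states.

```python
import numpy as np

# Synthetic stand-in for dataset.graph_edges (assumed layout: a leading 0,
# then a running total after each graph).
graph_edges = np.array([0, 4, 10, 13])

# Differences between consecutive running totals give per-graph counts.
edges_per_graph = np.diff(graph_edges)
print(edges_per_graph)  # [4 6 3]
```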
- graph_node_feat(graph_idx)[source]
Node features of a graph.
- Parameters
graph_idx (int) – index of the graph.
- Returns
numpy.ndarray, node features of the graph.
Examples
>>> # dataset is an instance object of Dataset
>>> graph_node_feat = dataset.graph_node_feat(graph_idx)
- graph_node_label(graph_idx)[source]
Node labels of a graph.
- Parameters
graph_idx (int) – index of the graph.
- Returns
numpy.ndarray, node labels of the graph.
Examples
>>> # dataset is an instance object of Dataset
>>> graph_node_label = dataset.graph_node_label(graph_idx)
- property graph_nodes
Cumulative node counts of the graphs.
- Returns
numpy.ndarray, array of cumulative node counts.
Examples
>>> # dataset is an instance object of Dataset
>>> graph_nodes = dataset.graph_nodes
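One way cumulative node counts can be used is to slice a stacked feature array into per-graph blocks. This is a sketch on synthetic data: the helper `slice_graph` is hypothetical, and the assumption that the array starts at 0 with `graph_count + 1` entries is not confirmed by this page.

```python
import numpy as np


def slice_graph(stacked, offsets, graph_idx):
    """Return the rows of `stacked` that belong to graph `graph_idx`.

    Hypothetical helper; assumes offsets[i]:offsets[i + 1] spans graph i.
    """
    start, stop = offsets[graph_idx], offsets[graph_idx + 1]
    return stacked[start:stop]


# Synthetic data: 6 nodes with 2 features each, in graphs of 2, 3, and 1 nodes.
node_feat = np.arange(12).reshape(6, 2)
graph_nodes = np.array([0, 2, 5, 6])

second_graph = slice_graph(node_feat, graph_nodes, 1)
print(second_graph.shape)  # (3, 2)
```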
- property node_feat
Node features.
- Returns
numpy.ndarray, array of node features.
Examples
>>> # dataset is an instance object of Dataset
>>> node_feat = dataset.node_feat
- property node_feat_size
Feature size of each node.
- Returns
int, the feature size of each node.
Examples
>>> # dataset is an instance object of Dataset
>>> node_feat_size = dataset.node_feat_size
- property node_label
Ground truth labels of each node.
- Returns
numpy.ndarray, array of node labels.
Examples
>>> # dataset is an instance object of Dataset
>>> node_label = dataset.node_label
- property num_classes
Number of label classes.
- Returns
int, the number of classes.
Examples
>>> # dataset is an instance object of Dataset
>>> num_classes = dataset.num_classes
- property test_graphs
IDs of the test graphs.
- Returns
numpy.ndarray, array of test graph IDs.
Examples
>>> # dataset is an instance object of Dataset
>>> test_graphs = dataset.test_graphs
- property test_mask
Mask of test nodes.
- Returns
numpy.ndarray, array of mask.
Examples
>>> # dataset is an instance object of Dataset
>>> test_mask = dataset.test_mask
- property train_graphs
IDs of the training graphs.
- Returns
numpy.ndarray, array of training graph IDs.
Examples
>>> # dataset is an instance object of Dataset
>>> train_graphs = dataset.train_graphs
- property train_mask
Mask of training nodes.
- Returns
numpy.ndarray, array of mask.
Examples
>>> # dataset is an instance object of Dataset
>>> train_mask = dataset.train_mask
- property val_graphs
IDs of the validation graphs.
- Returns
numpy.ndarray, array of validation graph IDs.
Examples
>>> # dataset is an instance object of Dataset
>>> val_graphs = dataset.val_graphs
- property val_mask
Mask of validation nodes.
- Returns
numpy.ndarray, array of mask.
Examples
>>> # dataset is an instance object of Dataset
>>> val_mask = dataset.val_mask
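The masks can be applied to the node arrays to select a split. The following is a minimal sketch on synthetic data, assuming each mask is a boolean array with one entry per node. The dtype and shape are assumptions, not stated above; if the masks turn out to be stored as 0/1 integers, converting them with `astype(bool)` first would be needed for this kind of indexing.

```python
import numpy as np

# Synthetic stand-ins for dataset.node_label and dataset.val_mask.
node_label = np.array([[1, 0], [0, 1], [1, 1], [0, 0]])
val_mask = np.array([False, True, False, True])

# Boolean indexing keeps only the rows where the mask is True.
val_labels = node_label[val_mask]
print(val_labels.shape)  # (2, 2)
```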