mindspore_gl.dataset
Reading and building interface for graph datasets.
- class mindspore_gl.dataset.CoraV2(root)
Cora dataset, a source dataset class for reading and parsing the Cora dataset.
- Parameters
root (str) – path to the root directory that contains cora_v2_with_mask.npz.
- Raises
RuntimeError – If root does not contain data files.
Examples
>>> from mindspore_gl.dataset import CoraV2
>>> root = "path/to/cora_v2_with_mask.npz"
>>> dataset = CoraV2(root)
About Cora dataset:
The Cora dataset consists of 2708 scientific publications classified into one of seven classes. The citation network consists of 10556 links. Each publication in the dataset is described by a 0/1-valued word vector indicating the absence/presence of the corresponding word from the dictionary. The dictionary consists of 1433 unique words.
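To make the feature encoding concrete, here is a small sketch of a 0/1-valued word vector. The three-word dictionary and the `encode` helper are invented for illustration only; the real dictionary has 1433 words and is not reproduced here.

```python
import numpy as np

# Toy dictionary (hypothetical); the real Cora dictionary has 1433 unique words.
dictionary = ["graph", "neural", "theory"]

def encode(words, dictionary):
    """Return a 0/1 vector: entry i is 1 if dictionary[i] occurs in `words`."""
    present = set(words)
    return np.array([1 if w in present else 0 for w in dictionary], dtype=np.int64)

# Repeated words do not change the result: only presence or absence is recorded.
vec = encode(["neural", "graph", "graph"], dictionary)
print(vec)  # [1 1 0]
```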
Statistics:
Nodes: 2708
Edges: 10556
Number of Classes: 7
Label split:
Train: 140
Valid: 500
Test: 1000
The dataset can be downloaded from <https://github.com/kimiyoung/planetoid>. Organize the dataset files into the following directory structure so that the API can read and process them.
.
└── corav2
    ├── ind.cora_v2.allx
    ├── ind.cora_v2.ally
    ├── ind.cora_v2.graph
    ├── ind.cora_v2.test.index
    ├── ind.cora_v2.tx
    ├── ind.cora_v2.ty
    ├── ind.cora_v2.x
    └── ind.cora_v2.y
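Before constructing the dataset, it can help to confirm that the raw files are in place. The following is a minimal sketch using only the standard library; `missing_raw_files` is a hypothetical helper, not part of mindspore_gl, and the file names are taken from the directory listing above.

```python
from pathlib import Path

# File names taken from the directory listing above.
EXPECTED_FILES = [
    "ind.cora_v2.allx", "ind.cora_v2.ally", "ind.cora_v2.graph",
    "ind.cora_v2.test.index", "ind.cora_v2.tx", "ind.cora_v2.ty",
    "ind.cora_v2.x", "ind.cora_v2.y",
]

def missing_raw_files(root):
    """Return the expected raw files that are absent from `root` (hypothetical helper)."""
    root = Path(root)
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]
```

For example, `missing_raw_files("corav2")` returns an empty list when every file in the listing is present, and otherwise the names that still need to be downloaded.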
- property adj_coo
Return the adjacency matrix in COO representation.
- Returns
numpy.ndarray, array of coo matrix.
Examples
>>> # dataset is an instance of CoraV2
>>> adj_coo = dataset.adj_coo
- property adj_csr
Return the adjacency matrix in CSR representation.
- Returns
numpy.ndarray, array of csr matrix.
Examples
>>> # dataset is an instance of CoraV2
>>> adj_csr = dataset.adj_csr
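The relationship between the COO and CSR representations can be sketched on a toy graph with NumPy. This is an illustration under assumptions: the exact array layout returned by `adj_coo` and `adj_csr` is not specified here, so the sketch assumes a `(2, E)` COO array of source and target indices, and `coo_to_csr` is a hypothetical helper.

```python
import numpy as np

# Toy directed graph with 3 nodes and 4 edges (not Cora data).
# Assumed COO layout: row 0 holds source nodes, row 1 holds target nodes.
adj_coo = np.array([[0, 0, 1, 2],
                    [1, 2, 2, 0]])
num_nodes = 3

def coo_to_csr(adj_coo, num_nodes):
    """Convert a (2, E) COO edge array to CSR arrays (indptr, indices)."""
    src, dst = adj_coo
    order = np.argsort(src, kind="stable")   # group edges by source node
    indices = dst[order]                     # column index of every edge
    counts = np.bincount(src, minlength=num_nodes)
    # The edges of row i occupy indices[indptr[i]:indptr[i + 1]].
    indptr = np.concatenate(([0], np.cumsum(counts)))
    return indptr, indices

indptr, indices = coo_to_csr(adj_coo, num_nodes)
print(indptr)   # [0 2 3 4]
print(indices)  # [1 2 2 0]
```

In this sketch the number of edges equals `len(indices)`, which matches the "length of csr col" description given for `edge_count`.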
- property edge_count
Number of edges
- Returns
int, length of csr col
Examples
>>> # dataset is an instance of CoraV2
>>> edge_count = dataset.edge_count
- property node_count
Number of nodes
- Returns
int, length of csr row
Examples
>>> # dataset is an instance of CoraV2
>>> node_count = dataset.node_count
- property node_feat
Node features
- Returns
numpy.ndarray, array of node feature
Examples
>>> # dataset is an instance of CoraV2
>>> node_feat = dataset.node_feat
- property node_label
Ground truth labels of each node
- Returns
numpy.ndarray, array of node label
Examples
>>> # dataset is an instance of CoraV2
>>> node_label = dataset.node_label
- property num_classes
Number of label classes
- Returns
int, the number of classes
Examples
>>> # dataset is an instance of CoraV2
>>> num_classes = dataset.num_classes
- property num_features
Feature size of each node
- Returns
int, the feature size of each node
Examples
>>> # dataset is an instance of CoraV2
>>> num_features = dataset.num_features
- property test_mask
Mask of test nodes
- Returns
numpy.ndarray, array of mask
Examples
>>> # dataset is an instance of CoraV2
>>> test_mask = dataset.test_mask
- property train_mask
Mask of training nodes
- Returns
numpy.ndarray, array of mask
Examples
>>> # dataset is an instance of CoraV2
>>> train_mask = dataset.train_mask
- property train_nodes
Indices of the training nodes
- Returns
numpy.ndarray, array of training nodes
Examples
>>> # dataset is an instance of CoraV2
>>> train_nodes = dataset.train_nodes
- property val_mask
Mask of validation nodes
- Returns
numpy.ndarray, array of mask
Examples
>>> # dataset is an instance of CoraV2
>>> val_mask = dataset.val_mask
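As an illustration of how node masks are typically used, the sketch below selects node indices from a mask and restricts an accuracy computation to the test nodes. All arrays are toy data; the boolean dtype of the masks and the `prediction` array are assumptions made for this example, not documented behavior of CoraV2.

```python
import numpy as np

# Toy data for 6 nodes (not Cora data). Boolean masks are an assumption;
# if the masks are stored as 0/1 integers, convert them with .astype(bool) first.
node_label = np.array([0, 1, 1, 2, 0, 2])
prediction = np.array([0, 1, 0, 2, 0, 1])   # hypothetical model output
train_mask = np.array([True, True, False, False, False, False])
test_mask = np.array([False, False, False, True, True, True])

# Indices of the nodes selected by a mask.
train_nodes = np.flatnonzero(train_mask)
print(train_nodes)  # [0 1]

# Accuracy restricted to the test nodes: 2 of the 3 test predictions are correct.
test_acc = np.mean(prediction[test_mask] == node_label[test_mask])
print(test_acc)
```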