mindspore_gl.dataset.IMDBBinary
- class mindspore_gl.dataset.IMDBBinary(root)
IMDBBinary Dataset, a source dataset for reading and parsing the IMDBBinary dataset.
About the IMDBBinary dataset:
IMDB-BINARY is a movie collaboration dataset that consists of the ego-networks of 1,000 actors/actresses who played roles in movies in IMDB. In each graph, nodes represent actors/actresses, and there is an edge between two of them if they appear in the same movie. These graphs are derived from the Action and Romance genres.
Statistics:
Nodes: 19773
Edges: 193062
Number of Graphs: 1000
Number of Classes: 2
Label split:
Train: 800
Valid: 200
The dataset can be downloaded here: <https://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/IMDB-BINARY.zip> You can organize the dataset files into the following directory structure and read them with the process API.
.
├── IMDB-BINARY_A.txt
├── IMDB-BINARY_graph_indicator.txt
└── IMDB-BINARY_graph_labels.txt
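The three files above follow the usual TU graph-kernel text layout. As a minimal sketch of how such files can be read with NumPy, the helper below parses their contents from strings. `parse_tu_text` is a hypothetical name, not part of mindspore_gl, and the tiny two-graph sample is made up for illustration. It assumes 1-based node and graph ids and one "src, dst" pair per line in the adjacency file; it does not claim to reproduce what IMDBBinary does internally.

```python
import numpy as np


def parse_tu_text(adjacency_text, indicator_text, labels_text):
    """Parse TU-format file contents passed as strings (hypothetical helper)."""
    # *_A.txt: one "src, dst" pair per line, node ids assumed 1-based.
    pairs = [
        [int(v) for v in line.split(",")]
        for line in adjacency_text.splitlines()
        if line.strip()
    ]
    edges = np.array(pairs, dtype=np.int64).reshape(-1, 2) - 1
    # *_graph_indicator.txt: line i holds the 1-based graph id of node i.
    node_graph = np.array([int(t) for t in indicator_text.split()], dtype=np.int64) - 1
    # *_graph_labels.txt: line i holds the class label of graph i.
    graph_label = np.array([int(t) for t in labels_text.split()], dtype=np.int64)
    return edges, node_graph, graph_label


# Made-up sample: graph 0 has 2 nodes, graph 1 has 3 nodes.
adjacency = "1, 2\n2, 1\n3, 4\n4, 3\n4, 5\n5, 4\n"
indicator = "1\n1\n2\n2\n2\n"
labels = "0\n1\n"

edges, node_graph, graph_label = parse_tu_text(adjacency, indicator, labels)
num_graphs = len(graph_label)
# Count nodes per graph, and edges per graph via the graph of each source node.
nodes_per_graph = np.bincount(node_graph, minlength=num_graphs)
edges_per_graph = np.bincount(node_graph[edges[:, 0]], minlength=num_graphs)
```

Reading the real files would only require passing the text of each file (for example via `pathlib.Path.read_text`) instead of the sample strings.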
- Parameters
root (str) – path to the root directory that contains imdb_binary_with_mask.npz
- Raises
TypeError – if root is not a str.
RuntimeError – if root does not contain data files.
Examples
>>> from mindspore_gl.dataset.imdb_binary import IMDBBinary
>>> root = "path/to/imdb_binary"
>>> dataset = IMDBBinary(root)
- property graph_count
Total number of graphs
- Returns
int, number of graphs
Examples
>>> # dataset is an instance object of Dataset
>>> graph_count = dataset.graph_count
- property graph_edges
Cumulative edge count over graphs
- Returns
numpy.ndarray, array of cumulative edge counts
Examples
>>> # dataset is an instance object of Dataset
>>> graph_edges = dataset.graph_edges
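A cumulative count array is convenient because both per-graph sizes and per-graph index ranges fall out of it directly. The sketch below uses a made-up stand-in array rather than a loaded dataset, and it assumes the array starts with 0 and has one entry per graph boundary; that layout is an assumption, not something this page states.

```python
import numpy as np

# Hypothetical stand-in for dataset.graph_edges (assumed layout: leading 0,
# then the running total of edges after each graph).
graph_edges = np.array([0, 2, 6, 12])

# Number of edges in each graph is the difference of consecutive entries.
edges_per_graph = np.diff(graph_edges)

# Half-open edge index range [start, stop) that belongs to graph 1.
graph_idx = 1
start, stop = int(graph_edges[graph_idx]), int(graph_edges[graph_idx + 1])
```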
- graph_feat(graph_idx)
Node features of a single graph.
- Parameters
graph_idx (int) – index of the graph.
- Returns
numpy.ndarray, node features of the graph.
Examples
>>> # dataset is an instance object of Dataset
>>> graph_idx = 0
>>> graph_feat = dataset.graph_feat(graph_idx)
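One plausible way to obtain the features of a single graph is to slice the full node feature array with cumulative node offsets. The sketch below illustrates that idea on made-up arrays. `slice_graph_feat` is a hypothetical helper, and the assumption that the offsets start at 0 is not confirmed by this page, so treat it as an illustration rather than the library's implementation.

```python
import numpy as np


def slice_graph_feat(node_feat, graph_nodes, graph_idx):
    """Return the feature rows of one graph (hypothetical helper)."""
    # Rows [start, stop) of node_feat are assumed to belong to graph_idx.
    start, stop = graph_nodes[graph_idx], graph_nodes[graph_idx + 1]
    return node_feat[start:stop]


# Made-up data: 5 nodes with 2 features each, split into graphs of 2 and 3 nodes.
node_feat = np.arange(10, dtype=np.float32).reshape(5, 2)
graph_nodes = np.array([0, 2, 5])

feat = slice_graph_feat(node_feat, graph_nodes, 1)
```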
- property graph_label
Graph labels
- Returns
numpy.ndarray, array of graph labels
Examples
>>> # dataset is an instance object of Dataset
>>> graph_label = dataset.graph_label
- property graph_nodes
Cumulative node count over graphs
- Returns
numpy.ndarray, array of cumulative node counts
Examples
>>> # dataset is an instance object of Dataset
>>> graph_nodes = dataset.graph_nodes
- property node_feat
Node features
- Returns
numpy.ndarray, array of node features
Examples
>>> # dataset is an instance object of Dataset
>>> node_feat = dataset.node_feat
- property num_classes
Number of label classes
- Returns
int, the number of classes
Examples
>>> # dataset is an instance object of Dataset
>>> num_classes = dataset.num_classes
- property num_edge_features
Feature size of each edge
- Returns
int, the feature size of each edge
Examples
>>> # dataset is an instance object of Dataset
>>> num_edge_features = dataset.num_edge_features
- property num_features
Feature size of each node
- Returns
int, the feature size of each node
Examples
>>> # dataset is an instance object of Dataset
>>> num_features = dataset.num_features
- property train_graphs
IDs of the training graphs
- Returns
numpy.ndarray, array of training graph IDs
Examples
>>> # dataset is an instance object of Dataset
>>> train_graphs = dataset.train_graphs
- property train_mask
Mask of training nodes
- Returns
numpy.ndarray, array of mask
Examples
>>> # dataset is an instance object of Dataset
>>> train_mask = dataset.train_mask
- property val_graphs
IDs of the validation graphs
- Returns
numpy.ndarray, array of validation graph IDs
Examples
>>> # dataset is an instance object of Dataset
>>> val_graphs = dataset.val_graphs
- property val_mask
Mask of validation nodes
- Returns
numpy.ndarray, array of mask
Examples
>>> # dataset is an instance object of Dataset
>>> val_mask = dataset.val_mask
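Masks and ID arrays are two encodings of the same split, and NumPy converts between them in one call. The sketch below builds an 80/20 split on ten made-up items to show the relationship. The boolean dtype and the way the split is drawn are assumptions for illustration only, and nothing here is read from the actual dataset.

```python
import numpy as np

num_items = 10
rng = np.random.default_rng(0)
perm = rng.permutation(num_items)

# First 80% of the shuffled ids go to training, the rest to validation.
train_ids = np.sort(perm[:8])
val_ids = np.sort(perm[8:])

# Encode the same split as boolean masks (assumed mask representation).
train_mask = np.zeros(num_items, dtype=bool)
train_mask[train_ids] = True
val_mask = ~train_mask

# Recover the id arrays from the masks.
train_ids_from_mask = np.flatnonzero(train_mask)
val_ids_from_mask = np.flatnonzero(val_mask)
```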