mindspore_gl.dataset.Alchemy
- class mindspore_gl.dataset.Alchemy(root, datasize=10000)
A source dataset for reading and parsing the Alchemy dataset.
About Alchemy dataset: The Tencent Quantum Lab has recently introduced a new molecular dataset, called Alchemy, to facilitate the development of new machine learning models useful for chemistry and materials science.
The dataset lists 12 quantum mechanical properties of 130,000+ organic molecules comprising up to 12 heavy atoms (C, N, O, S, F and Cl), sampled from the GDBMedChem database. These properties have been calculated using the open-source computational chemistry program Python-based Simulation of Chemistry Framework (PySCF).
Statistics:
Graphs: 99776
Average nodes per graph: 9.71
Average edges per graph: 10.02
Number of quantum mechanical properties: 12
The dataset can be downloaded here: Alchemy dev and Alchemy valid.
Organize the downloaded files into the following directory structure before reading them.
.
├── dev
│   ├── dev_target.csv
│   └── sdf
│       ├── atom_10
│       ├── atom_11
│       ├── atom_12
│       └── atom_9
└── valid
    ├── sdf
    │   ├── atom_11
    │   └── atom_12
    └── valid_target.csv
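Before constructing the dataset, it can help to confirm that the files are laid out as shown above. The following is a minimal sketch using only the standard library; the helper name `missing_alchemy_entries` and the `EXPECTED_ENTRIES` tuple are hypothetical and are not part of `mindspore_gl`. It checks only the top-level entries from the tree, not the individual `atom_*` folders.

```python
from pathlib import Path

# Top-level entries relative to the dataset root, copied from the tree above.
EXPECTED_ENTRIES = (
    "dev/dev_target.csv",
    "dev/sdf",
    "valid/valid_target.csv",
    "valid/sdf",
)


def missing_alchemy_entries(root):
    """Return the expected entries that are absent under ``root``.

    Hypothetical helper for illustration; not a mindspore_gl function.
    """
    root_path = Path(root)
    return [entry for entry in EXPECTED_ENTRIES if not (root_path / entry).exists()]
```

An empty list from `missing_alchemy_entries("path/to/alchemy")` suggests the layout matches the tree; any returned entry names a file or folder that still needs to be put in place.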
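- Parameters
root (str) – path to the root directory that contains the dataset files.
datasize (int) – size of the dataset to read; must not exceed 99776. Default: 10000.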
- Raises
TypeError – if root is not a str.
RuntimeError – if root does not contain data files.
ValueError – if datasize is more than 99776.
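The documented error conditions can be checked up front, before any files are read. The sketch below mirrors the `TypeError` and `ValueError` cases only; `check_alchemy_args` and `MAX_GRAPHS` are hypothetical names introduced for illustration, and the library's own checks may differ in wording and order.

```python
MAX_GRAPHS = 99776  # graph count listed in the statistics above


def check_alchemy_args(root, datasize=10000):
    """Hypothetical pre-check mirroring two of the documented errors.

    The RuntimeError case (root without data files) is not covered here.
    """
    if not isinstance(root, str):
        raise TypeError(f"root must be a str, got {type(root).__name__}")
    if datasize > MAX_GRAPHS:
        raise ValueError(f"datasize must not exceed {MAX_GRAPHS}, got {datasize}")
    return root, datasize
```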
Examples
>>> from mindspore_gl.dataset import Alchemy
>>> root = "path/to/alchemy"
>>> dataset = Alchemy(root)
- property edge_feat
Edge features.
- Returns
numpy.ndarray, array of edge features.
Examples
>>> # dataset is an instance of Alchemy
>>> edge_feat = dataset.edge_feat
- property edge_feat_size
Feature size of each edge.
- Returns
int, the size of each edge feature.
Examples
>>> # dataset is an instance of Alchemy
>>> edge_feat_size = dataset.edge_feat_size
- property graph_count
Total number of graphs.
- Returns
int, number of graphs.
Examples
>>> # dataset is an instance of Alchemy
>>> graph_count = dataset.graph_count
- graph_edge_feat(graph_idx)
Graph edge features.
- Parameters
graph_idx (int) – index of graph.
- Returns
numpy.ndarray, edge features of the graph.
Examples
>>> # dataset is an instance of Alchemy
>>> graph_idx = 0
>>> graph_edge_feat = dataset.graph_edge_feat(graph_idx)
- property graph_edges
Cumulative edge count over the graphs.
- Returns
numpy.ndarray, array of cumulative edge counts.
Examples
>>> # dataset is an instance of Alchemy
>>> graph_edges = dataset.graph_edges
- property graph_label
Graph label.
- Returns
numpy.ndarray, array of graph labels.
Examples
>>> # dataset is an instance of Alchemy
>>> graph_label = dataset.graph_label
- graph_node_feat(graph_idx)
Graph node features.
- Parameters
graph_idx (int) – index of graph.
- Returns
numpy.ndarray, node features of the graph.
Examples
>>> # dataset is an instance of Alchemy
>>> graph_idx = 0
>>> graph_node_feat = dataset.graph_node_feat(graph_idx)
- property graph_nodes
Cumulative node count over the graphs.
- Returns
numpy.ndarray, array of cumulative node counts.
Examples
>>> # dataset is an instance of Alchemy
>>> graph_nodes = dataset.graph_nodes
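Cumulative counts are commonly used to locate one graph's rows inside a stacked feature array. The sketch below illustrates that idea with plain Python lists standing in for the numpy arrays. It assumes the cumulative array starts with 0 and has one more entry than there are graphs; this layout is an assumption for illustration and is not confirmed by this page. The values and the helper `slice_graph_rows` are made up.

```python
from itertools import accumulate

# Stand-in data: three graphs with 2, 3 and 1 nodes (not real Alchemy values).
nodes_per_graph = [2, 3, 1]

# Assumed layout: a leading 0 followed by running totals, so graph i owns
# rows cumulative[i] .. cumulative[i + 1] - 1 of the stacked feature array.
cumulative_nodes = [0] + list(accumulate(nodes_per_graph))

# Stacked per-node features, one row per node across all graphs.
node_feat = [[float(i)] for i in range(cumulative_nodes[-1])]


def slice_graph_rows(stacked, cumulative, graph_idx):
    """Return the rows of ``stacked`` that belong to graph ``graph_idx``."""
    start, stop = cumulative[graph_idx], cumulative[graph_idx + 1]
    return stacked[start:stop]
```

Under the same assumption, the idea carries over to edge features and a cumulative edge array.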
- property node_feat
Node features.
- Returns
numpy.ndarray, array of node features.
Examples
>>> # dataset is an instance of Alchemy
>>> node_feat = dataset.node_feat
- property node_feat_size
Feature size of each node.
- Returns
int, the size of each node feature.
Examples
>>> # dataset is an instance of Alchemy
>>> node_feat_size = dataset.node_feat_size
- property num_classes
Graph label size.
- Returns
int, size of graph label.
Examples
>>> # dataset is an instance of Alchemy
>>> num_classes = dataset.num_classes
- property train_graphs
IDs of the training graphs.
- Returns
numpy.ndarray, array of training graph IDs.
Examples
>>> # dataset is an instance of Alchemy
>>> train_graphs = dataset.train_graphs
- property train_mask
Mask of training nodes.
- Returns
numpy.ndarray, array of mask.
Examples
>>> # dataset is an instance of Alchemy
>>> train_mask = dataset.train_mask
- property val_graphs
IDs of the validation graphs.
- Returns
numpy.ndarray, array of validation graph IDs.
Examples
>>> # dataset is an instance of Alchemy
>>> val_graphs = dataset.val_graphs
- property val_mask
Mask of validation nodes.
- Returns
numpy.ndarray, array of mask.
Examples
>>> # dataset is an instance of Alchemy
>>> val_mask = dataset.val_mask
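A mask is typically applied by keeping only the entries whose flag is set. The sketch below shows this with plain Python lists standing in for the numpy arrays. It assumes each mask holds one 0/1 flag per item; that shape is an assumption and is not stated on this page. The values and the helper `select_by_mask` are made up for illustration.

```python
# Stand-in data (not real Alchemy values): four items and two disjoint masks.
labels = [0.1, 0.2, 0.3, 0.4]
train_mask = [1, 1, 0, 1]
val_mask = [0, 0, 1, 0]


def select_by_mask(items, mask):
    """Keep the items whose corresponding mask entry is truthy."""
    return [item for item, keep in zip(items, mask) if keep]


train_labels = select_by_mask(labels, train_mask)
val_labels = select_by_mask(labels, val_mask)
```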