mindspore.dataset.GraphData
- class mindspore.dataset.GraphData(dataset_file, num_parallel_workers=None, working_mode='local', hostname='127.0.0.1', port=50051, num_client=1, auto_shutdown=True)
Reads the graph dataset used for GNN training from the shared file and database.
- Parameters
dataset_file (str) – One of the file names in the dataset.
num_parallel_workers (int, optional) – Number of workers to process the dataset in parallel (default=None).
working_mode (str, optional) –
Working mode. Supported values are 'local', 'client' and 'server' (default='local').
'local': used in non-distributed training scenarios.
'client': used in distributed training scenarios. The client does not load data, but obtains it from the server.
'server': used in distributed training scenarios. The server loads the data and makes it available to clients.
hostname (str, optional) – Hostname of the graph data server. This parameter is only valid when working_mode is set to 'client' or 'server' (default='127.0.0.1').
port (int, optional) – Port of the graph data server. The valid range is 1024-65535. This parameter is only valid when working_mode is set to 'client' or 'server' (default=50051).
num_client (int, optional) – Maximum number of clients expected to connect to the server. The server allocates resources according to this parameter. This parameter is only valid when working_mode is set to 'server' (default=1).
auto_shutdown (bool, optional) – Only valid when working_mode is set to 'server'. When the number of connected clients reaches num_client and no client is being connected, the server exits automatically (default=True).
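The rules above (three working modes, a port range of 1024-65535, and parameters that only apply in some modes) can be checked before the constructor is called. The following is a minimal sketch of such a check. The helper name graph_data_kwargs is hypothetical and not part of MindSpore; it only encodes the constraints stated in the parameter list.

```python
VALID_MODES = ("local", "client", "server")


def graph_data_kwargs(working_mode="local", hostname="127.0.0.1",
                      port=50051, num_client=1, auto_shutdown=True):
    """Hypothetical helper: validate arguments and return keyword
    arguments that could be passed as GraphData(dataset_file, **kwargs)."""
    if working_mode not in VALID_MODES:
        raise ValueError(f"working_mode must be one of {VALID_MODES}")
    kwargs = {"working_mode": working_mode}
    if working_mode == "local":
        # hostname, port, num_client and auto_shutdown are not used locally.
        return kwargs
    if not isinstance(port, int) or not 1024 <= port <= 65535:
        raise ValueError("port must be an integer in the range 1024-65535")
    kwargs.update(hostname=hostname, port=port)
    if working_mode == "server":
        # Only the server allocates resources for clients and may auto-exit.
        kwargs.update(num_client=num_client, auto_shutdown=auto_shutdown)
    return kwargs


server_kwargs = graph_data_kwargs("server", port=50051, num_client=2)
client_kwargs = graph_data_kwargs("client", port=50051)
# With MindSpore installed and a real dataset file, one might then write:
#   import mindspore.dataset as ds
#   graph = ds.GraphData("dataset_file", **client_kwargs)
```

Building the keyword arguments separately keeps the mode-specific parameters out of calls where they have no effect.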
Examples
>>> import mindspore.dataset as ds
>>>
>>> data_graph = ds.GraphData('dataset_file', 2)
>>> nodes = data_graph.get_all_nodes(0)
>>> features = data_graph.get_node_feature(nodes, [1])
- get_all_edges(edge_type)
Get all edges in the graph.
- Parameters
edge_type (int) – Specify the type of edge.
- Returns
numpy.ndarray, array of edges.
Examples
>>> import mindspore.dataset as ds
>>>
>>> data_graph = ds.GraphData('dataset_file', 2)
>>> edges = data_graph.get_all_edges(0)
- Raises
TypeError – If edge_type is not an integer.
- get_all_neighbors(node_list, neighbor_type)
Get all neighbors of type neighbor_type for the nodes in node_list.
- Parameters
node_list (Union[list, numpy.ndarray]) – The given list of nodes.
neighbor_type (int) – Specify the type of neighbor.
- Returns
numpy.ndarray, array of neighbors.
Examples
>>> import mindspore.dataset as ds
>>>
>>> data_graph = ds.GraphData('dataset_file', 2)
>>> nodes = data_graph.get_all_nodes(0)
>>> neighbors = data_graph.get_all_neighbors(nodes, 0)
- get_all_nodes(node_type)
Get all nodes in the graph.
- Parameters
node_type (int) – Specify the type of node.
- Returns
numpy.ndarray, array of nodes.
Examples
>>> import mindspore.dataset as ds
>>>
>>> data_graph = ds.GraphData('dataset_file', 2)
>>> nodes = data_graph.get_all_nodes(0)
- Raises
TypeError – If node_type is not an integer.
- get_edge_feature(edge_list, feature_types)
Get the features of the given feature_types for the edges in edge_list.
- Parameters
edge_list (Union[list, numpy.ndarray]) – The given list of edges.
feature_types (Union[list, numpy.ndarray]) – The given list of feature types.
- Returns
numpy.ndarray, array of features.
Examples
>>> import mindspore.dataset as ds
>>>
>>> data_graph = ds.GraphData('dataset_file', 2)
>>> edges = data_graph.get_all_edges(0)
>>> features = data_graph.get_edge_feature(edges, [1])
- get_neg_sampled_neighbors(node_list, neg_neighbor_num, neg_neighbor_type)
Get negatively sampled neighbors of type neg_neighbor_type for the nodes in node_list.
- Parameters
node_list (Union[list, numpy.ndarray]) – The given list of nodes.
neg_neighbor_num (int) – Number of neighbors sampled.
neg_neighbor_type (int) – Specify the type of negative neighbor.
- Returns
numpy.ndarray, array of neighbors.
Examples
>>> import mindspore.dataset as ds
>>>
>>> data_graph = ds.GraphData('dataset_file', 2)
>>> nodes = data_graph.get_all_nodes(0)
>>> neg_neighbors = data_graph.get_neg_sampled_neighbors(nodes, 5, 0)
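To illustrate what negative sampling means here, the following NumPy sketch draws, for each input node, nodes of the requested type that are not its neighbors. It is not the MindSpore implementation. The toy graph, the function name neg_sampled_neighbors, sampling with replacement, and the row layout (input node followed by its samples) are all assumptions made for illustration.

```python
import numpy as np

# Toy graph (hypothetical data): node id -> neighbor ids, node id -> node type.
neighbors = {1: [2, 3], 2: [1], 3: [1], 4: [], 5: []}
node_type = {1: 0, 2: 0, 3: 0, 4: 0, 5: 0}


def neg_sampled_neighbors(node_list, neg_neighbor_num, neg_neighbor_type, rng):
    """Sketch: for each node, draw nodes of the given type that are not its neighbors."""
    rows = []
    for node in node_list:
        excluded = set(neighbors[node]) | {node}
        candidates = [n for n, t in node_type.items()
                      if t == neg_neighbor_type and n not in excluded]
        # Assumption: sample with replacement so every row has a fixed length.
        picked = rng.choice(candidates, size=neg_neighbor_num, replace=True)
        rows.append([node, *picked])
    return np.array(rows)


rng = np.random.default_rng(0)
neg = neg_sampled_neighbors([1, 2], neg_neighbor_num=3, neg_neighbor_type=0, rng=rng)
print(neg.shape)  # (2, 4): one row per input node, node id followed by 3 samples
```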
- get_node_feature(node_list, feature_types)
Get the features of the given feature_types for the nodes in node_list.
- Parameters
node_list (Union[list, numpy.ndarray]) – The given list of nodes.
feature_types (Union[list, numpy.ndarray]) – The given list of feature types.
- Returns
numpy.ndarray, array of features.
Examples
>>> import mindspore.dataset as ds
>>>
>>> data_graph = ds.GraphData('dataset_file', 2)
>>> nodes = data_graph.get_all_nodes(0)
>>> features = data_graph.get_node_feature(nodes, [1])
- get_nodes_from_edges(edge_list)
Get nodes from the edges.
- Parameters
edge_list (Union[list, numpy.ndarray]) – The given list of edges.
- Returns
numpy.ndarray, array of nodes.
- Raises
TypeError – If edge_list is not a list or numpy.ndarray.
- get_sampled_neighbors(node_list, neighbor_nums, neighbor_types)
Get sampled neighbor information.
The API supports multi-hop neighbor sampling: the sampling result of one hop is used as the input of the next hop. A maximum of 6 hops is allowed.
The sampling result is tiled into a list in the format [input node, 1-hop sampling result, 2-hop sampling result …]
- Parameters
node_list (Union[list, numpy.ndarray]) – The given list of nodes.
neighbor_nums (Union[list, numpy.ndarray]) – Number of neighbors sampled per hop.
neighbor_types (Union[list, numpy.ndarray]) – Neighbor type sampled per hop.
- Returns
numpy.ndarray, array of neighbors.
Examples
>>> import mindspore.dataset as ds
>>>
>>> data_graph = ds.GraphData('dataset_file', 2)
>>> nodes = data_graph.get_all_nodes(0)
>>> neighbors = data_graph.get_sampled_neighbors(nodes, [2, 2], [0, 0])
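The tiled multi-hop format can be illustrated with a small NumPy sketch, in which each hop samples from the nodes produced by the previous hop. This is an illustration under stated assumptions, not the MindSpore implementation: the toy adjacency, the function name sampled_neighbors, sampling with replacement, and one output row per input node are assumptions, and neighbor types are ignored for brevity.

```python
import numpy as np

# Toy adjacency (hypothetical data): node id -> neighbor ids.
adjacency = {1: [2, 3], 2: [1, 3], 3: [1]}


def sampled_neighbors(node_list, neighbor_nums, rng, max_hops=6):
    """Sketch of multi-hop sampling; neighbor types are ignored for brevity."""
    if len(neighbor_nums) > max_hops:
        raise ValueError("at most 6 hops are allowed")
    rows = []
    for node in node_list:
        row, frontier = [node], [node]
        for num in neighbor_nums:
            next_frontier = []
            for src in frontier:
                # The previous hop's result is the input of the next hop.
                picked = rng.choice(adjacency[src], size=num, replace=True)
                next_frontier.extend(int(n) for n in picked)
            row.extend(next_frontier)
            frontier = next_frontier
        rows.append(row)
    return np.array(rows)


rng = np.random.default_rng(0)
result = sampled_neighbors([1, 2], neighbor_nums=[2, 2], rng=rng)
print(result.shape)  # (2, 7): 1 input node + 2 one-hop + 4 two-hop samples per row
```

With neighbor_nums=[2, 2], each row holds 1 + 2 + 2×2 = 7 entries, which shows how the row width grows multiplicatively with each hop.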
- graph_info()
Get the meta information of the graph, including the number of nodes, the type of nodes, the feature information of nodes, the number of edges, the type of edges, and the feature information of edges.
- Returns
dict, meta information of the graph. The keys are node_type, edge_type, node_num, edge_num, node_feature_type and edge_feature_type.
- random_walk(target_nodes, meta_path, step_home_param=1.0, step_away_param=1.0, default_node=-1)
Perform a random walk starting from the given nodes.
- Parameters
target_nodes (list[int]) – Start node list of the random walk.
meta_path (list[int]) – Node type for each walk step.
step_home_param (float, optional) – Return hyperparameter of the node2vec algorithm (default=1.0).
step_away_param (float, optional) – In-out hyperparameter of the node2vec algorithm (default=1.0).
default_node (int, optional) – Default node used when no more neighbors are found (default=-1). A value of -1 indicates that no node is given.
- Returns
numpy.ndarray, array of nodes.
Examples
>>> import mindspore.dataset as ds
>>>
>>> data_graph = ds.GraphData('dataset_file', 2)
>>> nodes = data_graph.random_walk([1,2], [1,2,1,2,1])
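The roles of the return and in-out hyperparameters can be illustrated with a node2vec-style biased walk in NumPy. This is a sketch under stated assumptions, not the MindSpore implementation: the toy graph and the function name biased_walk are hypothetical, p stands in for step_home_param and q for step_away_param, and the per-step node types of meta_path are ignored in favor of a fixed walk length.

```python
import numpy as np

# Toy undirected graph (hypothetical data): node id -> neighbor ids.
adjacency = {1: [2, 3], 2: [1, 3], 3: [1, 2, 4], 4: [3]}


def biased_walk(start, walk_length, p, q, rng, default_node=-1):
    """Sketch of a node2vec-style walk; p is the return parameter, q the in-out parameter."""
    walk = [start]
    prev = None
    while len(walk) < walk_length:
        cur = walk[-1]
        candidates = adjacency.get(cur, [])
        if not candidates:
            # No more neighbors: pad the rest of the walk with default_node.
            walk.extend([default_node] * (walk_length - len(walk)))
            break
        weights = []
        for cand in candidates:
            if prev is None:
                weights.append(1.0)      # first step: uniform choice
            elif cand == prev:
                weights.append(1.0 / p)  # step back to the previous node
            elif cand in adjacency[prev]:
                weights.append(1.0)      # stay close to the previous node
            else:
                weights.append(1.0 / q)  # move away from the previous node
        probs = np.array(weights) / sum(weights)
        nxt = int(rng.choice(candidates, p=probs))
        prev = cur
        walk.append(nxt)
    return walk


rng = np.random.default_rng(0)
walks = np.array([biased_walk(n, 5, p=1.0, q=1.0, rng=rng) for n in [1, 2]])
print(walks.shape)  # (2, 5): one walk of 5 nodes per start node
```

In this sketch, a larger p makes stepping back to the previous node less likely, and a larger q makes moving farther away less likely. With p = q = 1.0 the walk is unbiased.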