mindspore.profiler.DynamicProfilerMonitor
- class mindspore.profiler.DynamicProfilerMonitor(cfg_path=None, output_path='./dyn_profile_data', poll_interval=2, **kwargs)[source]
This class enables dynamic profile monitoring of MindSpore neural networks.
- Parameters
cfg_path (str) –
(Ascend only) Dynamic profile json config file directory. The requirement is a shared path that can be accessed by all nodes. The parameters of the json configuration file are as follows:
start_step (int, required) - Sets the step number at which the Profiler starts collecting data. It is a relative value, with the first step of training being 1. The default value is -1, indicating that data collection will not start during the entire training process.
stop_step (int, required) - Sets the step number at which the Profiler stops collecting data. It is a relative value, with the first step of training being 1. The stop_step must be greater than or equal to start_step. The default value is -1, indicating that data collection will not start during the entire training process.
aic_metrics (int/str, optional) - Set the collection of AI Core metric data. The current version accepts either the int or the str type; later versions will accept only the str type. Here, 0 and "PipeUtilization" represent PipeUtilization; 1 and "ArithmeticUtilization" represent ArithmeticUtilization; 2 and "Memory" represent Memory; 3 and "MemoryL0" represent MemoryL0; 4 and "MemoryUB" represent MemoryUB; 5 and "ResourceConflictRatio" represent ResourceConflictRatio; 6 and "L2Cache" represent L2Cache; 7 and "MemoryAccess" represent MemoryAccess. The default value "AiCoreNone" indicates that AI Core metrics are not collected.
profiler_level (int/str, optional) - Set the level for collecting performance data. The current version accepts either the int or the str type; later versions will accept only the str type. Here, -1 and "LevelNone" represent ProfilerLevel.LevelNone; 0 and "Level0" represent ProfilerLevel.Level0; 1 and "Level1" represent ProfilerLevel.Level1; 2 and "Level2" represent ProfilerLevel.Level2. The default value "Level0" indicates the collection level ProfilerLevel.Level0.
activities (int/list, optional) - Set the devices for which performance data is collected. The current version accepts either the int or the list type; later versions will accept only the list type. Here, 0 and ["CPU","NPU"] represent CPU+NPU; 1 and ["CPU"] represents CPU; 2 and ["NPU"] represents NPU. The default value ["CPU","NPU"] indicates that performance data of both CPU and NPU is collected.
export_type (int/list, optional) - Set the type of the exported performance data. The current version accepts either the int or the list type; later versions will accept only the list type. Here, 0 and ["text"] represent text; 1 and ["db"] represent db; 2 and ["text","db"] represent both text and db. The default value ["text"] indicates that only text-type performance data is exported.
profile_memory (bool, optional) - Set whether to collect memory performance data, where true means to collect and false means not to collect. The default value is false, indicating that memory performance data is not collected.
mstx (bool, optional) - Set whether to enable mstx, true indicates that mstx is enabled, false indicates that mstx is disabled. The default value is false, indicating that mstx is not enabled.
analyse (bool, optional) - Set whether to enable online analysis. True indicates that online analysis is enabled, while false indicates that online analysis is disabled. The default value is false, indicating that online analysis is not enabled.
analyse_mode (int, optional) - Sets the mode for online analysis, corresponding to the analyse_mode parameter of the mindspore.Profiler.analyse interface, where 0 represents "sync" and 1 represents "async". The default value is -1, indicating that online analysis is not used.
parallel_strategy (bool, optional) - Sets whether to collect parallel strategy performance data, where true means to collect and false means not to collect. The default value is false, indicating that parallel strategy performance data is not collected.
with_stack (bool, optional) - Sets whether to collect call stack information, where true means to collect and false means not to collect. The default value is false, indicating that call stack information is not collected.
data_simplification (bool, optional) - Sets whether to enable data simplification, where true means to enable and false means not to enable. The default value is true, indicating that data simplification is enabled.
record_shapes (bool, optional) - Sets whether to collect operator input tensor shapes data, where true means that the shape data is collected and false means that the shape data is not collected. The default value is false, indicating that input tensor shapes data is not collected.
mstx_domain_include (list, optional) - Set the set of enabled domain names when the mstx switch is turned on. Each name must be of str type. Default value: [], indicating that this parameter is not used to control domains. This parameter is mutually exclusive with the mstx_domain_exclude parameter; the two cannot be set simultaneously. If both are set, only the mstx_domain_include parameter takes effect.
mstx_domain_exclude (list, optional) - Set the set of domain names that are not enabled when the mstx switch is turned on. Each name must be of str type. Default value: [], indicating that this parameter is not used to control domains.
prof_path (str, optional) - Output data path of the dynamic profiler. It is the same as the interface parameter output_path; when both are set, prof_path takes effect. Default value: "./dyn_profile_data".
sys_io (bool, optional) - Set whether to collect NIC and RoCE data. Default value: False, indicating that these data are not collected.
sys_interconnection (bool, optional) - Set whether to collect system interconnection data, including aggregate collective communication statistics (HCCS), PCIe data, and inter-chip transmission bandwidth information. Default value: False, indicating that these data are not collected.
host_sys (list, optional) - Collect system-call, storage, and CPU-usage data on the host side; pass in a list containing one or more of "cpu", "mem", "disk", "network" and "osrt". Here, "cpu" represents process-level CPU utilization, "mem" represents process-level memory utilization, "disk" represents process-level disk I/O utilization, "network" represents system-level network I/O utilization, and "osrt" represents system-level syscalls and pthread calls. Default value: [], indicating that host-side system data is not collected.
output_path (str, optional) – (Ascend only) Output data path. Default: "./dyn_profile_data".
poll_interval (int, optional) – (Ascend only) The polling period of the monitoring process, in seconds. Default value: 2.
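As an illustration of the json configuration described above, the sketch below writes a minimal config file containing only the two required fields; all omitted keys fall back to the defaults listed in the parameter descriptions. The directory path used here is a placeholder chosen for the example, and the file name profiler_config.json matches the name used in the step() example further down.

```python
import json
import os
import tempfile

# Minimal dynamic-profiler config: only the required fields.
# start_step/stop_step are relative step numbers (the first training
# step is 1), and stop_step must be >= start_step.
cfg = {
    "start_step": 2,  # begin collecting at step 2
    "stop_step": 5,   # stop collecting after step 5
}

cfg_dir = tempfile.mkdtemp()  # placeholder for a shared path reachable by all nodes
cfg_file = os.path.join(cfg_dir, "profiler_config.json")
with open(cfg_file, "w", encoding="utf-8") as f:
    json.dump(cfg, f, indent=4)

# Read it back to confirm the file is valid json.
with open(cfg_file, encoding="utf-8") as f:
    loaded = json.load(f)
```

The directory containing this file is what gets passed as cfg_path; because the monitoring process polls the file every poll_interval seconds, the same file can be edited mid-training to schedule a new collection window.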
- Raises
RuntimeError – When the number of attempts to create shared memory exceeds the maximum number of retries.
- Supported Platforms:
Ascend
GPU
Examples
>>> import numpy as np
>>> import mindspore as ms
>>> from mindspore import nn
>>> import mindspore.dataset as ds
>>> from mindspore.profiler import DynamicProfilerMonitor
>>>
>>> class Net(nn.Cell):
...     def __init__(self):
...         super(Net, self).__init__()
...         self.fc = nn.Dense(2, 2)
...     def construct(self, x):
...         return self.fc(x)
>>>
>>> def generator():
...     for i in range(2):
...         yield (np.ones([2, 2]).astype(np.float32), np.ones([2]).astype(np.int32))
>>>
>>> def train(net):
...     optimizer = nn.Momentum(net.trainable_params(), 1, 0.9)
...     loss = nn.SoftmaxCrossEntropyWithLogits(sparse=True)
...     data = ds.GeneratorDataset(generator, ["data", "label"])
...     dynprof_cb = DynamicProfilerMonitor(cfg_path="./dyn_cfg", output_path="./dyn_prof_data")
...     model = ms.train.Model(net, loss, optimizer)
...     # register DynamicProfilerMonitor to model.train()
...     model.train(10, data, callbacks=[dynprof_cb])
- step()[source]
Used on Ascend to mark step boundaries, so that the dynamic profiler can distinguish steps when collecting and parsing performance data.
- Raises
RuntimeError – If the 'start_step' parameter setting is greater than the 'stop_step' parameter setting.
Examples
>>> import json
>>> import os
>>> import numpy as np
>>>
>>> import mindspore
>>> import mindspore.dataset as ds
>>> from mindspore import context, nn
>>> from mindspore.profiler import DynamicProfilerMonitor
>>>
>>>
>>> class Net(nn.Cell):
...     def __init__(self):
...         super(Net, self).__init__()
...         self.fc = nn.Dense(2, 2)
...
...     def construct(self, x):
...         return self.fc(x)
>>>
>>> def generator_net():
...     for _ in range(2):
...         yield np.ones([2, 2]).astype(np.float32), np.ones([2]).astype(np.int32)
>>>
>>> def train(test_net):
...     optimizer = nn.Momentum(test_net.trainable_params(), 1, 0.9)
...     loss = nn.SoftmaxCrossEntropyWithLogits(sparse=True)
...     data = ds.GeneratorDataset(generator_net(), ["data", "label"])
...     model = mindspore.train.Model(test_net, loss, optimizer)
...     model.train(1, data)
>>>
>>> def change_cfg_json(json_path):
...     with open(json_path, 'r', encoding='utf-8') as file:
...         data = json.load(file)
...
...     data['start_step'] = 6
...     data['stop_step'] = 7
...
...     with open(json_path, 'w', encoding='utf-8') as file:
...         json.dump(data, file, ensure_ascii=False, indent=4)
>>>
>>> if __name__ == '__main__':
...     # set json configuration file
...     context.set_context(mode=mindspore.PYNATIVE_MODE)
...     mindspore.set_device("Ascend")
...     data_cfg = {
...         "start_step": 2,
...         "stop_step": 5,
...         "aic_metrics": -1,
...         "profiler_level": 0,
...         "activities": 0,
...         "export_type": 0,
...         "profile_memory": False,
...         "mstx": False,
...         "analyse_mode": 0,
...         "parallel_strategy": False,
...         "with_stack": False,
...         "data_simplification": True,
...     }
...     output_path = "./cfg_path"
...     cfg_path = os.path.join(output_path, "profiler_config.json")
...     os.makedirs(output_path, exist_ok=True)
...     # set cfg file
...     with open(cfg_path, 'w') as f:
...         json.dump(data_cfg, f, indent=4)
...     # cfg_path contains the json configuration file path, and output_path is the output path
...     dp = DynamicProfilerMonitor(cfg_path=output_path, output_path=output_path)
...     STEP_NUM = 15
...     # Define a network of training models
...     net = Net()
...     for i in range(STEP_NUM):
...         print(f"step {i}")
...         train(net)
...         # Modify the configuration file after the sixth step,
...         # changing start_step to 6 and stop_step to 7
...         if i == 5:
...             # Modify parameters in the JSON file
...             change_cfg_json(os.path.join(output_path, "profiler_config.json"))
...         # Call step collection
...         dp.step()