mindspore.profiler.DynamicProfilerMonitor

class mindspore.profiler.DynamicProfilerMonitor(cfg_path, output_path='./dyn_profile_data', poll_interval=2, **kwargs)

This class dynamically collects performance data for MindSpore neural networks.

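Parameters:
  • cfg_path (str) - Path to the folder containing the JSON configuration file for dynamic profiling. The path must be a shared directory accessible from all nodes. The JSON configuration file accepts the following parameters.

    • start_step (int, required) - The step at which Profiler starts collecting. This is a relative value; the first training step is 1. Default: -1, meaning collection never starts during the training run.

    • stop_step (int, required) - The step at which Profiler stops collecting. This is a relative value; the first training step is 1. stop_step must be greater than or equal to start_step. Default: -1, meaning collection never starts during the training run.

    • aicore_metrics (int, optional) - The AI Core metrics to collect. The values correspond one-to-one with those of Profiler. Default: -1, meaning no AI Core metrics are collected. 0 means PipeUtilization; 1 means ArithmeticUtilization; 2 means Memory; 3 means MemoryL0; 4 means MemoryUB; 5 means ResourceConflictRatio; 6 means L2Cache.

    • profiler_level (int, optional) - The level of performance data to collect. 0 means ProfilerLevel.Level0, 1 means ProfilerLevel.Level1, 2 means ProfilerLevel.Level2. Default: 0, meaning ProfilerLevel.Level0.

    • activities (int, optional) - The devices from which performance data is collected. 0 means CPU+NPU, 1 means CPU, 2 means NPU. Default: 0, meaning CPU+NPU.

    • analyse_mode (int, optional) - The online analysis mode, corresponding to the analyse_mode parameter of the mindspore.Profiler.analyse interface. 0 means "sync", 1 means "async". Default: -1, meaning online analysis is not used.

    • parallel_strategy (bool, optional) - Whether to collect parallel strategy performance data. true means collect, false means do not collect. Default: false.

    • with_stack (bool, optional) - Whether to collect call stack information. true means collect, false means do not collect. Default: false.

    • data_simplification (bool, optional) - Whether to enable data simplification. true means enabled, false means disabled. Default: true.

  • output_path (str, optional) - Output path for the dynamic profiling data. Default: "./dyn_profile_data"

  • poll_interval (int, optional) - Polling period of the monitoring process, in seconds. Default: 2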
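The integer codes listed above are easy to mistype. As a minimal sketch, the hypothetical helper `build_config` below (not part of MindSpore) maps readable names to the documented codes and returns a dict that can be serialized to the JSON configuration file. The labels "Disabled" and "off" for the -1 values are assumptions made for illustration only.

```python
import json

# Name-to-code tables copied from the parameter descriptions above.
# "Disabled" and "off" are illustrative labels for the documented -1 values.
AICORE_METRICS = {
    "Disabled": -1,
    "PipeUtilization": 0,
    "ArithmeticUtilization": 1,
    "Memory": 2,
    "MemoryL0": 3,
    "MemoryUB": 4,
    "ResourceConflictRatio": 5,
    "L2Cache": 6,
}
PROFILER_LEVEL = {"Level0": 0, "Level1": 1, "Level2": 2}
ACTIVITIES = {"CPU+NPU": 0, "CPU": 1, "NPU": 2}
ANALYSE_MODE = {"off": -1, "sync": 0, "async": 1}


def build_config(start_step, stop_step, aicore_metrics="Disabled",
                 profiler_level="Level0", activities="CPU+NPU",
                 analyse_mode="off", parallel_strategy=False,
                 with_stack=False, data_simplification=True):
    """Hypothetical helper: build a config dict with the documented keys."""
    return {
        "start_step": start_step,
        "stop_step": stop_step,
        "aicore_metrics": AICORE_METRICS[aicore_metrics],
        "profiler_level": PROFILER_LEVEL[profiler_level],
        "activities": ACTIVITIES[activities],
        "analyse_mode": ANALYSE_MODE[analyse_mode],
        "parallel_strategy": parallel_strategy,
        "with_stack": with_stack,
        "data_simplification": data_simplification,
    }


config = build_config(2, 5, aicore_metrics="L2Cache", analyse_mode="async")
# json.dumps writes Python True/False as JSON true/false.
config_text = json.dumps(config, indent=4)
```

An unknown name raises `KeyError` at build time, which surfaces a typo before the file reaches the monitoring process.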

Raises:
  • RuntimeError - The number of failed attempts to create the monitoring process exceeds the maximum limit.

Supported platforms:

Ascend GPU

Examples:

>>> import numpy as np
>>> import mindspore as ms
>>> from mindspore import nn
>>> import mindspore.dataset as ds
>>> from mindspore.profiler import DynamicProfilerMonitor
>>>
>>> class Net(nn.Cell):
...     def __init__(self):
...         super(Net, self).__init__()
...         self.fc = nn.Dense(2,2)
...     def construct(self, x):
...         return self.fc(x)
>>>
>>> def generator():
...     for i in range(2):
...         yield (np.ones([2, 2]).astype(np.float32), np.ones([2]).astype(np.int32))
>>>
>>> def train(net):
...     optimizer = nn.Momentum(net.trainable_params(), 1, 0.9)
...     loss = nn.SoftmaxCrossEntropyWithLogits(sparse=True)
...     data = ds.GeneratorDataset(generator, ["data", "label"])
...     dynprof_cb = DynamicProfilerMonitor(cfg_path="./dyn_cfg", output_path="./dyn_prof_data")
...     model = ms.train.Model(net, loss, optimizer)
...     # register DynamicProfilerMonitor to model.train()
...     model.train(10, data, callbacks=[dynprof_cb])
step()

Used on Ascend devices to mark step boundaries so that performance data is collected and parsed per step.

Raises:
  • RuntimeError - If start_step is set to a value greater than stop_step.
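The constraint above can also be checked before a configuration is written, so that a bad range is caught early instead of surfacing during training. The following is a minimal sketch; `check_step_range` is a hypothetical helper, not a MindSpore API, and treating (-1, -1) as the only valid "disabled" combination is an assumption based on the documented defaults.

```python
def check_step_range(start_step, stop_step):
    """Hypothetical pre-check for start_step/stop_step (not a MindSpore API)."""
    # Assumption: both values at the documented default -1 means "never collect".
    if start_step == -1 and stop_step == -1:
        return None
    # Steps are relative and the first training step is 1.
    if start_step < 1:
        raise ValueError(f"start_step must be >= 1, got {start_step}")
    # Documented constraint: stop_step >= start_step.
    if stop_step < start_step:
        raise ValueError(
            f"stop_step ({stop_step}) must be >= start_step ({start_step})"
        )
    return None
```

For example, `check_step_range(2, 5)` passes silently, while `check_step_range(6, 5)` raises `ValueError`.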

Examples:

>>> import json
>>> import os
>>> import numpy as np
>>>
>>> import mindspore
>>> import mindspore.dataset as ds
>>> from mindspore import context, nn
>>> from mindspore.profiler import DynamicProfilerMonitor
>>>
>>>
>>> class Net(nn.Cell):
...     def __init__(self):
...         super(Net, self).__init__()
...         self.fc = nn.Dense(2, 2)
...
...     def construct(self, x):
...         return self.fc(x)
>>>
>>> def generator_net():
...     for _ in range(2):
...         yield np.ones([2, 2]).astype(np.float32), np.ones([2]).astype(np.int32)
>>>
>>> def train(test_net):
...     optimizer = nn.Momentum(test_net.trainable_params(), 1, 0.9)
...     loss = nn.SoftmaxCrossEntropyWithLogits(sparse=True)
...     data = ds.GeneratorDataset(generator_net(), ["data", "label"])
...     model = mindspore.train.Model(test_net, loss, optimizer)
...     model.train(1, data)
>>>
>>> def change_cfg_json(json_path):
...     with open(json_path, 'r', encoding='utf-8') as file:
...          data = json.load(file)
...
...     data['start_step'] = 8
...     data['stop_step'] = 10
...
...     with open(json_path, 'w', encoding='utf-8') as file:
...          json.dump(data, file, ensure_ascii=False, indent=4)
>>>
>>> if __name__ == '__main__':
...      # set json configuration file
...      cfg_json = {
...          "start_step": 2,
...          "stop_step": 5,
...          "aicore_metrics": -1,
...          "profiler_level": 0,
...          "activities": 0,
...          "analyse_mode": -1,
...          "parallel_strategy": False,
...          "with_stack": False,
...          "data_simplification": True,
...          }
...      context.set_context(mode=mindspore.PYNATIVE_MODE)
...      mindspore.set_device("Ascend")
...      # cfg_path is the folder that holds the json configuration file
...      cfg_path = "./cfg_path"
...      os.makedirs(cfg_path, exist_ok=True)
...      # set cfg file
...      with open(os.path.join(cfg_path, "profiler_config.json"), 'w') as f:
...           json.dump(cfg_json, f, indent=4)
...      # Assume RANK_ID, if set in the environment, holds a numeric value
...      rank_id = int(os.getenv('RANK_ID')) if os.getenv('RANK_ID') else 0
...      # cfg_path is the configuration folder, and output_path is the output path
...      dp = DynamicProfilerMonitor(cfg_path=cfg_path, output_path="./dyn_profile_data")
...      STEP_NUM = 15
...      # Define a network of training models
...      net = Net()
...      for i in range(STEP_NUM):
...          print(f"step {i}")
...          train(net)
...          # Modify the configuration file after step 7. For example, change start_step to 8 and stop_step to 10
...          if i == 7:
...             # Modify parameters in the JSON file
...             change_cfg_json(os.path.join(cfg_path, "profiler_config.json"))
...          # Call step collection after every training step
...          dp.step()
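Because the monitoring process polls the configuration on a fixed interval, a file edited during training could in principle be read while it is only partly written. As a precaution, the sketch below writes the new content to a temporary file and swaps it in with `os.replace`. `update_cfg_json` is a hypothetical helper, and this page does not say whether the monitor tolerates partial files or ignores other files in the folder, so both points are assumptions.

```python
import json
import os
import tempfile


def update_cfg_json(json_path, **changes):
    """Hypothetical helper: update a JSON config file via temp file + os.replace."""
    with open(json_path, "r", encoding="utf-8") as file:
        data = json.load(file)
    data.update(changes)

    # The temp file lives in the same folder so os.replace stays on one filesystem.
    dir_name = os.path.dirname(os.path.abspath(json_path))
    fd, tmp_path = tempfile.mkstemp(dir=dir_name, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as file:
            json.dump(data, file, ensure_ascii=False, indent=4)
        # Swap the finished file into place in a single rename operation.
        os.replace(tmp_path, json_path)
    except BaseException:
        # Clean up the temp file if writing or the swap failed.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
    return data
```

A call such as `update_cfg_json("./cfg_path/profiler_config.json", start_step=8, stop_step=10)` could stand in for a plain read-modify-write helper where concurrent reads are a concern.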