mindspore.profiler.DynamicProfilerMonitor
- class mindspore.profiler.DynamicProfilerMonitor(cfg_path=None, output_path='./dyn_profile_data', poll_interval=2, **kwargs)[source]
This class enables dynamic profile monitoring of MindSpore neural networks.
- Parameters
cfg_path (str) –
(Ascend only) Dynamic profile json config file directory. The requirement is a shared path that can be accessed by all nodes. The parameters of the json configuration file are as follows:
start_step (int, required) - Sets the step number at which the Profiler starts collecting data. It is a relative value, with the first step of training being 1. The default value is -1, indicating that data collection will not start during the entire training process.
stop_step (int, required) - Sets the step number at which the Profiler stops collecting data. It is a relative value, with the first step of training being 1. The stop_step must be greater than or equal to start_step. The default value is -1, indicating that data collection will not start during the entire training process.
aic_metrics (int/str, optional) - Set the collection of AI Core metric data. The current version accepts either the int or the str type; later versions will accept only the str type. Here, 0 and "PipeUtilization" represent PipeUtilization; 1 and "ArithmeticUtilization" represent ArithmeticUtilization; 2 and "Memory" represent Memory; 3 and "MemoryL0" represent MemoryL0; 4 and "MemoryUB" represent MemoryUB; 5 and "ResourceConflictRatio" represent ResourceConflictRatio; 6 and "L2Cache" represent L2Cache; 7 and "MemoryAccess" represent MemoryAccess. The default value "AiCoreNone" indicates that AI Core metrics are not collected.
profiler_level (int/str, optional) - Set the level for collecting performance data. The current version accepts either the int or the str type; later versions will accept only the str type. Here, -1 and "LevelNone" represent ProfilerLevel.LevelNone; 0 and "Level0" represent ProfilerLevel.Level0; 1 and "Level1" represent ProfilerLevel.Level1; 2 and "Level2" represent ProfilerLevel.Level2. The default value "Level0" indicates the collection level ProfilerLevel.Level0.
activities (int/list, optional) - Set the devices for which performance data is collected. The current version accepts either the int or the list type; later versions will accept only the list type. Here, 0 and ["CPU","NPU"] represent CPU+NPU; 1 and ["CPU"] represents CPU; 2 and ["NPU"] represents NPU. The default value ["CPU","NPU"] indicates that performance data of both CPU and NPU is collected.
export_type (int/list, optional) - Set the type of the exported performance data. The current version accepts either the int or the list type; later versions will accept only the list type. Here, 0 and ["text"] represent text; 1 and ["db"] represent db; 2 and ["text","db"] represent both text and db. The default value ["text"] indicates that only text-type performance data is exported.
profile_memory (bool, optional) - Set whether to collect memory performance data, where true means to collect and false means not to collect. The default value is false, indicating that memory performance data is not collected.
mstx (bool, optional) - Set whether to enable mstx, true indicates that mstx is enabled, false indicates that mstx is disabled. The default value is false, indicating that mstx is not enabled.
analyse (bool, optional) - Set whether to enable online analysis. True indicates that online analysis is enabled, while false indicates that online analysis is disabled. The default value is false, indicating that online analysis is not enabled.
analyse_mode (int, optional) - Sets the mode for online analysis, corresponding to the analyse_mode parameter of the mindspore.Profiler.analyse interface, where 0 represents "sync" and 1 represents "async". The default value is -1, indicating that online analysis is not used.
parallel_strategy (bool, optional) - Sets whether to collect parallel strategy performance data, where true means to collect and false means not to collect. The default value is false, indicating that parallel strategy performance data is not collected.
with_stack (bool, optional) - Sets whether to collect call stack information, where true means to collect and false means not to collect. The default value is false, indicating that call stack information is not collected.
data_simplification (bool, optional) - Sets whether to enable data simplification, where true means to enable and false means not to enable. The default value is true, indicating that data simplification is enabled.
record_shapes (bool, optional) - Sets whether to collect operator input tensor shapes data, where true means that the shape data is collected and false means that the shape data is not collected. The default value is false, indicating that input tensor shapes data is not collected.
mstx_domain_include (list, optional) - Set the set of enabled domain names when the mstx switch is turned on. Each name must be of str type. Default value: [], indicating that this parameter is not used to control domains. This parameter is mutually exclusive with the mstx_domain_exclude parameter; the two cannot be set simultaneously. If both are set, only the mstx_domain_include parameter takes effect.
mstx_domain_exclude (list, optional) - Set the set of domain names that are not enabled when the mstx switch is turned on. Each name must be of str type. Default value: [], indicating that this parameter is not used to control domains.
prof_path (str, optional) - Output data path of the dynamic profiler. It is the same as the interface parameter output_path; when both are set, prof_path takes effect. Default value: "./dyn_profile_data".
sys_io (bool, optional) - Set whether to collect NIC and RoCE data. Default value: False, indicating that these data are not collected.
sys_interconnection (bool, optional) - Set whether to collect system interconnection data, including aggregate collective communication statistics (HCCS), PCIe data, and inter-chip transmission bandwidth information. Default value: False, indicating that these data are not collected.
host_sys (list, optional) - Collect system-call, storage, and CPU-usage data on the host side; pass in a list containing one or more of "cpu", "mem", "disk", "network" and "osrt". Here, "cpu" represents process-level CPU utilization, "mem" represents process-level memory utilization, "disk" represents process-level disk I/O utilization, "network" represents system-level network I/O utilization, and "osrt" represents system-level syscalls and pthread calls. Default value: [], indicating that host-side system data is not collected.
output_path (str, optional) – (Ascend only) Output data path. Default: "./dyn_profile_data".
poll_interval (int, optional) – (Ascend only) The polling period of the monitoring process, in seconds. Default value: 2.
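As an illustration of the json configuration described above, the sketch below writes a minimal config file containing only the two required fields; all omitted keys fall back to the defaults listed in the parameter descriptions. The directory path used here is a placeholder chosen for the example, and the file name profiler_config.json matches the name used in the step() example further down.

```python
import json
import os
import tempfile

# Minimal dynamic-profiler config: only the required fields.
# start_step/stop_step are relative step numbers (the first training
# step is 1), and stop_step must be >= start_step.
cfg = {
    "start_step": 2,  # begin collecting at step 2
    "stop_step": 5,   # stop collecting after step 5
}

cfg_dir = tempfile.mkdtemp()  # placeholder for a shared path reachable by all nodes
cfg_file = os.path.join(cfg_dir, "profiler_config.json")
with open(cfg_file, "w", encoding="utf-8") as f:
    json.dump(cfg, f, indent=4)

# Read it back to confirm the file is valid json.
with open(cfg_file, encoding="utf-8") as f:
    loaded = json.load(f)
```

The directory containing this file is what gets passed as cfg_path; because the monitoring process polls the file every poll_interval seconds, the same file can be edited mid-training to schedule a new collection window.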
- Raises
RuntimeError – When the number of attempts to create shared memory exceeds the maximum number of retries.
- Supported Platforms:
Ascend
GPU
Examples
>>> import numpy as np
>>> import mindspore as ms
>>> from mindspore import nn
>>> import mindspore.dataset as ds
>>> from mindspore.profiler import DynamicProfilerMonitor
>>>
>>> class Net(nn.Cell):
...     def __init__(self):
...         super(Net, self).__init__()
...         self.fc = nn.Dense(2, 2)
...     def construct(self, x):
...         return self.fc(x)
>>>
>>> def generator():
...     for i in range(2):
...         yield (np.ones([2, 2]).astype(np.float32), np.ones([2]).astype(np.int32))
>>>
>>> def train(net):
...     optimizer = nn.Momentum(net.trainable_params(), 1, 0.9)
...     loss = nn.SoftmaxCrossEntropyWithLogits(sparse=True)
...     data = ds.GeneratorDataset(generator, ["data", "label"])
...     dynprof_cb = DynamicProfilerMonitor(cfg_path="./dyn_cfg", output_path="./dyn_prof_data")
...     model = ms.train.Model(net, loss, optimizer)
...     # register DynamicProfilerMonitor to model.train()
...     model.train(10, data, callbacks=[dynprof_cb])
- step()[source]
Used on Ascend to mark step boundaries, so that the dynamic profiler can distinguish steps when collecting and parsing performance data.
- Raises
RuntimeError – If the 'start_step' parameter setting is greater than the 'stop_step' parameter setting.
Examples
>>> import json
>>> import os
>>> import numpy as np
>>>
>>> import mindspore
>>> import mindspore.dataset as ds
>>> from mindspore import context, nn
>>> from mindspore.profiler import DynamicProfilerMonitor
>>>
>>>
>>> class Net(nn.Cell):
...     def __init__(self):
...         super(Net, self).__init__()
...         self.fc = nn.Dense(2, 2)
...
...     def construct(self, x):
...         return self.fc(x)
>>>
>>> def generator_net():
...     for _ in range(2):
...         yield np.ones([2, 2]).astype(np.float32), np.ones([2]).astype(np.int32)
>>>
>>> def train(test_net):
...     optimizer = nn.Momentum(test_net.trainable_params(), 1, 0.9)
...     loss = nn.SoftmaxCrossEntropyWithLogits(sparse=True)
...     data = ds.GeneratorDataset(generator_net(), ["data", "label"])
...     model = mindspore.train.Model(test_net, loss, optimizer)
...     model.train(1, data)
>>>
>>> def change_cfg_json(json_path):
...     with open(json_path, 'r', encoding='utf-8') as file:
...         data = json.load(file)
...
...     data['start_step'] = 6
...     data['stop_step'] = 7
...
...     with open(json_path, 'w', encoding='utf-8') as file:
...         json.dump(data, file, ensure_ascii=False, indent=4)
>>>
>>> if __name__ == '__main__':
...     # set json configuration file
...     context.set_context(mode=mindspore.PYNATIVE_MODE)
...     mindspore.set_device("Ascend")
...     data_cfg = {
...         "start_step": 2,
...         "stop_step": 5,
...         "aic_metrics": -1,
...         "profiler_level": 0,
...         "activities": 0,
...         "export_type": 0,
...         "profile_memory": False,
...         "mstx": False,
...         "analyse_mode": 0,
...         "parallel_strategy": False,
...         "with_stack": False,
...         "data_simplification": True,
...     }
...     output_path = "./cfg_path"
...     cfg_path = os.path.join(output_path, "profiler_config.json")
...     os.makedirs(output_path, exist_ok=True)
...     # set cfg file
...     with open(cfg_path, 'w') as f:
...         json.dump(data_cfg, f, indent=4)
...     # cfg_path contains the json configuration file path, and output_path is the output path
...     dp = DynamicProfilerMonitor(cfg_path=output_path, output_path=output_path)
...     STEP_NUM = 15
...     # Define a network of training models
...     net = Net()
...     for i in range(STEP_NUM):
...         print(f"step {i}")
...         train(net)
...         # Modify the configuration file after the sixth step,
...         # changing start_step to 6 and stop_step to 7
...         if i == 5:
...             # Modify parameters in the JSON file
...             change_cfg_json(os.path.join(output_path, "profiler_config.json"))
...         # Call step collection
...         dp.step()