mindspore.profiler.profile
- class mindspore.profiler.profile(activities: list = None, with_stack: bool = False, profile_memory: bool = False, data_process: bool = False, parallel_strategy: bool = False, start_profile: bool = True, hbm_ddr: bool = False, pcie: bool = False, sync_enable: bool = True, schedule: Schedule = None, on_trace_ready: Optional[Callable[..., Any]] = None, experimental_config: Optional[_ExperimentalConfig] = None)
This class enables profiling of MindSpore neural networks. Import mindspore.profiler.profile and initialize a profile object to begin profiling, call profile.start() to start collection, and call profile.stop() to stop collecting and analyze the results. The results can be visualized with the MindStudio Insight tool. profile currently supports analysis of AICORE operator, AICPU operator, HostCPU operator, memory, correspondence, cluster, and other data.
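For orientation, a minimal start/stop lifecycle (without a step schedule) looks like the sketch below; the matmul workload and the use of default arguments are illustrative only, not a recommended configuration:

>>> import numpy as np
>>> import mindspore
>>> from mindspore import Tensor, ops
>>>
>>> # Initialize without auto-start, then drive collection explicitly.
>>> prof = mindspore.profiler.profile(start_profile=False)
>>> prof.start()
>>> x = Tensor(np.ones([8, 8]).astype(np.float32))
>>> y = ops.matmul(x, x)  # any workload to be profiled
>>> prof.stop()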
- Parameters
start_profile (bool, optional) – Whether to start collecting performance data as soon as the profile object is initialized. Set to False to start collection later via start(). Default: True.
activities (list, optional) – The activities to collect. Default: [ProfilerActivity.CPU, ProfilerActivity.NPU].
ProfilerActivity.CPU: Collect MindSpore framework data.
ProfilerActivity.NPU: Collect CANN software stack and NPU data.
ProfilerActivity.GPU: Collect GPU data.
schedule (schedule, optional) – Sets the action strategy for the capture, defined by the schedule class and used together with the step() interface (see the sketch following this parameter list). Default: None.
on_trace_ready (Callable, optional) – Sets the callback function executed when the performance data has been collected. Default: None.
profile_memory (bool, optional) – (Ascend only) Whether to collect tensor memory data; collected when True. When using this parameter, activities must be set to [ProfilerActivity.CPU, ProfilerActivity.NPU]. Collecting operator memory data when the graph compilation level is O2 requires collecting from the first step. Default: False. The operator names currently collected by this parameter are incomplete; this issue will be resolved in later versions. It is recommended to use the environment variable MS_ALLOC_CONF instead.
with_stack (bool, optional) – (Ascend only) Whether to collect host call stack data on the Python side. This data is presented as a flame graph in the timeline. When using this parameter, activities must include ProfilerActivity.CPU. Default: False.
hbm_ddr (bool, optional) – (Ascend only) Whether to collect On-Chip Memory/DDR read and write rate data; collected when True. Default: False.
pcie (bool, optional) – (Ascend only) Whether to collect PCIe bandwidth data; collected when True. Default: False.
data_process (bool, optional) – (Ascend/GPU) Whether to collect data-preparation performance data. Default: False.
parallel_strategy (bool, optional) – (Ascend only) Whether to collect parallel strategy performance data. Default: False.
sync_enable (bool, optional) – (GPU only) Whether the profiler collects operators synchronously. Default: True.
True: the synchronous way. Before the operator is sent to the GPU, the CPU records the start timestamp; after the operator returns to the CPU, the end timestamp is recorded. The operator's duration is the difference between the two timestamps.
False: the asynchronous way. The operator's duration is the time taken to send it from the CPU to the GPU. This method reduces the impact of profiling on overall training time.
experimental_config (_ExperimentalConfig, optional) – Expandable parameters can be configured through this item.
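As a reference for the schedule parameter, the sketch below spells out how the phase arguments used in the examples on this page map to training steps. The step-to-phase mapping follows the usual wait/warmup/active cycle convention and is an illustration, not normative output:

>>> import mindspore
>>> # With wait=1, warmup=1, active=2, repeat=1, skip_first=2:
>>> #   steps 0-1: skipped entirely (skip_first)
>>> #   step  2  : wait   - no data collected
>>> #   step  3  : warmup - collection starts, data discarded
>>> #   steps 4-5: active - data collected and recorded
>>> # When the active window completes, the on_trace_ready callback fires.
>>> sched = mindspore.profiler.schedule(wait=1, warmup=1, active=2, repeat=1, skip_first=2)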
- Raises
RuntimeError – If the CANN version does not match the MindSpore version, MindSpore cannot parse the generated ascend_job_id directory structure.
- Supported Platforms:
Ascend
GPU
Examples
>>> import numpy as np
>>> import mindspore
>>> from mindspore import nn, context
>>> import mindspore.dataset as ds
>>> from mindspore.profiler import ProfilerLevel, ProfilerActivity, AicoreMetrics, ExportType
>>>
>>> class Net(nn.Cell):
...     def __init__(self):
...         super(Net, self).__init__()
...         self.fc = nn.Dense(2, 2)
...     def construct(self, x):
...         return self.fc(x)
>>>
>>> def generator():
...     for i in range(2):
...         yield np.ones([2, 2]).astype(np.float32), np.ones([2]).astype(np.int32)
>>>
>>> def train(net):
...     optimizer = nn.Momentum(net.trainable_params(), 1, 0.9)
...     loss = nn.SoftmaxCrossEntropyWithLogits(sparse=True)
...     data = ds.GeneratorDataset(generator, ["data", "label"])
...     model = mindspore.train.Model(net, loss, optimizer)
...     model.train(1, data)
>>>
>>> if __name__ == '__main__':
...     # If the device_target is GPU, set the device_target to "GPU"
...     context.set_context(mode=mindspore.GRAPH_MODE)
...     mindspore.set_device("Ascend")
...
...     # Init Profiler
...     experimental_config = mindspore.profiler._ExperimentalConfig(
...         profiler_level=ProfilerLevel.Level0,
...         aic_metrics=AicoreMetrics.AiCoreNone,
...         l2_cache=False,
...         mstx=False,
...         data_simplification=False,
...         export_type=[ExportType.Text])
...     steps = 10
...     net = Net()
...     # Note that the Profiler should be initialized before model.train
...     with mindspore.profiler.profile(activities=[ProfilerActivity.CPU, ProfilerActivity.NPU],
...                                     schedule=mindspore.profiler.schedule(wait=1, warmup=1, active=2,
...                                                                          repeat=1, skip_first=2),
...                                     on_trace_ready=mindspore.profiler.tensorboard_trace_handler("./data"),
...                                     profile_memory=False,
...                                     experimental_config=experimental_config) as prof:
...
...         # Train Model
...         for step in range(steps):
...             train(net)
...             prof.step()
- add_metadata(key: str, value: str)
Report custom metadata key-value pair data.
Examples
>>> import mindspore
>>> # Profiler init.
>>> with mindspore.profiler.profile() as prof:
...     # Call Profiler add_metadata
...     prof.add_metadata("test_key", "test_value")
- add_metadata_json(key: str, value: str)
Report custom metadata key-value pair data with the value as a JSON string data.
- Parameters
key (str) – The key to the metadata.
value (str) – The value to the metadata, passed as a JSON string.
Examples
>>> import json
>>> import mindspore
>>> # Profiler init.
>>> with mindspore.profiler.profile() as prof:
...     # Call Profiler add_metadata_json
...     prof.add_metadata_json("test_key", json.dumps({"key1": 1, "key2": 2}))
- start()
Turn on profile data collection. Collection can be started conditionally by initializing profile with start_profile=False and calling this method when needed.
- Raises
RuntimeError – If the profile has already started.
RuntimeError – If the start_profile parameter is not set or is set to True.
Examples
>>> import numpy as np
>>> import mindspore
>>> from mindspore import nn, context
>>> import mindspore.dataset as ds
>>> from mindspore.profiler import ProfilerLevel, ProfilerActivity, AicoreMetrics, ExportType
>>>
>>> class Net(nn.Cell):
...     def __init__(self):
...         super(Net, self).__init__()
...         self.fc = nn.Dense(2, 2)
...     def construct(self, x):
...         return self.fc(x)
>>>
>>> def generator():
...     for i in range(2):
...         yield np.ones([2, 2]).astype(np.float32), np.ones([2]).astype(np.int32)
>>>
>>> def train(net):
...     optimizer = nn.Momentum(net.trainable_params(), 1, 0.9)
...     loss = nn.SoftmaxCrossEntropyWithLogits(sparse=True)
...     data = ds.GeneratorDataset(generator, ["data", "label"])
...     model = mindspore.train.Model(net, loss, optimizer)
...     model.train(1, data)
>>>
>>> if __name__ == '__main__':
...     # If the device_target is GPU, set the device_target to "GPU"
...     context.set_context(mode=mindspore.GRAPH_MODE)
...     mindspore.set_device("Ascend")
...
...     # Init Profiler
...     experimental_config = mindspore.profiler._ExperimentalConfig(
...         profiler_level=ProfilerLevel.Level0,
...         aic_metrics=AicoreMetrics.AiCoreNone,
...         l2_cache=False,
...         mstx=False,
...         data_simplification=False,
...         export_type=[ExportType.Text])
...     steps = 10
...     net = Net()
...     # Note that the Profiler should be initialized before model.train
...     prof = mindspore.profiler.profile(activities=[ProfilerActivity.CPU, ProfilerActivity.NPU],
...                                       schedule=mindspore.profiler.schedule(wait=1, warmup=1, active=2,
...                                                                            repeat=1, skip_first=2),
...                                       on_trace_ready=mindspore.profiler.tensorboard_trace_handler("./data"),
...                                       profile_memory=False,
...                                       experimental_config=experimental_config)
...     prof.start()
...     # Train Model
...     for step in range(steps):
...         train(net)
...         prof.step()
...     prof.stop()
- step()
Used on Ascend to mark step boundaries; collection and parsing of performance data are carried out per step according to the schedule and on_trace_ready settings.
- Raises
RuntimeError – If the start_profile parameter is not set or the Profiler is not started.
RuntimeError – If the schedule parameter is not set.
Examples
>>> import numpy as np
>>> import mindspore
>>> from mindspore import nn, context
>>> import mindspore.dataset as ds
>>> from mindspore.profiler import ProfilerLevel, ProfilerActivity, AicoreMetrics, ExportType
>>>
>>> class Net(nn.Cell):
...     def __init__(self):
...         super(Net, self).__init__()
...         self.fc = nn.Dense(2, 2)
...     def construct(self, x):
...         return self.fc(x)
>>>
>>> def generator():
...     for i in range(2):
...         yield np.ones([2, 2]).astype(np.float32), np.ones([2]).astype(np.int32)
>>>
>>> def train(net):
...     optimizer = nn.Momentum(net.trainable_params(), 1, 0.9)
...     loss = nn.SoftmaxCrossEntropyWithLogits(sparse=True)
...     data = ds.GeneratorDataset(generator, ["data", "label"])
...     model = mindspore.train.Model(net, loss, optimizer)
...     model.train(1, data)
>>>
>>> if __name__ == '__main__':
...     # If the device_target is GPU, set the device_target to "GPU"
...     context.set_context(mode=mindspore.GRAPH_MODE)
...     mindspore.set_device("Ascend")
...
...     # Init Profiler
...     experimental_config = mindspore.profiler._ExperimentalConfig(
...         profiler_level=ProfilerLevel.Level0,
...         aic_metrics=AicoreMetrics.AiCoreNone,
...         l2_cache=False,
...         mstx=False,
...         data_simplification=False,
...         export_type=[ExportType.Text])
...     steps = 10
...     net = Net()
...     # Note that the Profiler should be initialized before model.train
...     with mindspore.profiler.profile(activities=[ProfilerActivity.CPU, ProfilerActivity.NPU],
...                                     schedule=mindspore.profiler.schedule(wait=1, warmup=1, active=2,
...                                                                          repeat=1, skip_first=2),
...                                     on_trace_ready=mindspore.profiler.tensorboard_trace_handler("./data"),
...                                     profile_memory=False,
...                                     experimental_config=experimental_config) as prof:
...
...         # Train Model
...         for step in range(steps):
...             train(net)
...             prof.step()
- stop()
Turn off profile data collection. Collection can be stopped conditionally, mirroring start().
- Raises
RuntimeError – If the profile has not been started.
Examples
>>> import numpy as np
>>> import mindspore
>>> from mindspore import nn, context
>>> import mindspore.dataset as ds
>>> from mindspore.profiler import ProfilerLevel, ProfilerActivity, AicoreMetrics, ExportType
>>>
>>> class Net(nn.Cell):
...     def __init__(self):
...         super(Net, self).__init__()
...         self.fc = nn.Dense(2, 2)
...     def construct(self, x):
...         return self.fc(x)
>>>
>>> def generator():
...     for i in range(2):
...         yield np.ones([2, 2]).astype(np.float32), np.ones([2]).astype(np.int32)
>>>
>>> def train(net):
...     optimizer = nn.Momentum(net.trainable_params(), 1, 0.9)
...     loss = nn.SoftmaxCrossEntropyWithLogits(sparse=True)
...     data = ds.GeneratorDataset(generator, ["data", "label"])
...     model = mindspore.train.Model(net, loss, optimizer)
...     model.train(1, data)
>>>
>>> if __name__ == '__main__':
...     # If the device_target is GPU, set the device_target to "GPU"
...     context.set_context(mode=mindspore.GRAPH_MODE)
...     mindspore.set_device("Ascend")
...
...     # Init Profiler
...     experimental_config = mindspore.profiler._ExperimentalConfig(
...         profiler_level=ProfilerLevel.Level0,
...         aic_metrics=AicoreMetrics.AiCoreNone,
...         l2_cache=False,
...         mstx=False,
...         data_simplification=False,
...         export_type=[ExportType.Text])
...     steps = 10
...     net = Net()
...     # Note that the Profiler should be initialized before model.train
...     prof = mindspore.profiler.profile(activities=[ProfilerActivity.CPU, ProfilerActivity.NPU],
...                                       schedule=mindspore.profiler.schedule(wait=1, warmup=1, active=2,
...                                                                            repeat=1, skip_first=2),
...                                       on_trace_ready=mindspore.profiler.tensorboard_trace_handler("./data"),
...                                       profile_memory=False,
...                                       experimental_config=experimental_config)
...     prof.start()
...     # Train Model
...     for step in range(steps):
...         train(net)
...         prof.step()
...     prof.stop()