mindspore.profiler._ExperimentalConfig

class mindspore.profiler._ExperimentalConfig(profiler_level: ProfilerLevel = ProfilerLevel.Level0, aic_metrics: AicoreMetrics = AicoreMetrics.AiCoreNone, l2_cache: bool = False, mstx: bool = False, data_simplification: bool = True, export_type: list = None)

This class configures the extensible parameters used when collecting model performance data with profile.

Parameters
  • profiler_level (ProfilerLevel, optional) –

    (Ascend only) The level of profiling. Default: ProfilerLevel.Level0.

    • ProfilerLevel.LevelNone: Takes effect only when mstx is enabled. No operator data is collected on the device side.

    • ProfilerLevel.Level0: The leanest level of data collection. Collects the elapsed time of computational operators on the NPU and information about large communication operators.

    • ProfilerLevel.Level1: In addition to Level0, collects CANN-layer AscendCL data, AICore performance metrics, and information about small communication operators.

    • ProfilerLevel.Level2: In addition to Level1, collects GE and Runtime information in the CANN layer.

  • aic_metrics (AicoreMetrics, optional) –

    (Ascend only) The type of AICORE performance data to collect. When this parameter is used, activities must include ProfilerActivity.NPU, and the value must be a member of AicoreMetrics. When profiler_level is Level0, the default value is AicoreMetrics.AiCoreNone; when profiler_level is Level1 or Level2, the default value is AicoreMetrics.PipeUtilization. The data items contained in each metric are as follows:

    • AicoreMetrics.AiCoreNone: Does not collect AICORE data.

    • AicoreMetrics.ArithmeticUtilization: ArithmeticUtilization contains mac_fp16/int8_ratio, vec_fp32/fp16/int32_ratio, vec_misc_ratio etc.

    • AicoreMetrics.PipeUtilization: PipeUtilization contains vec_ratio, mac_ratio, scalar_ratio, mte1/mte2/mte3_ratio, icache_miss_rate etc.

    • AicoreMetrics.Memory: Memory contains ub_read/write_bw, l1_read/write_bw, l2_read/write_bw, main_mem_read/write_bw etc.

    • AicoreMetrics.MemoryL0: MemoryL0 contains l0a_read/write_bw, l0b_read/write_bw, l0c_read/write_bw etc.

    • AicoreMetrics.ResourceConflictRatio: ResourceConflictRatio contains vec_bankgroup/bank/resc_cflt_ratio etc.

    • AicoreMetrics.MemoryUB: MemoryUB contains ub_read/write_bw_mte, ub_read/write_bw_vector, ub_read/write_bw_scalar etc.

    • AicoreMetrics.L2Cache: L2Cache contains write_cache_hit, write_cache_miss_allocate, r0_read_cache_hit, r1_read_cache_hit etc. This metric is supported only on Atlas A2 training series products.

    • AicoreMetrics.MemoryAccess: MemoryAccess contains statistics on the access bandwidth and storage capacity of main memory, the L2 cache etc.

  • l2_cache (bool, optional) – (Ascend only) Whether to collect L2 cache data. Data is collected when True. Default: False. The l2_cache.csv file is generated in the ASCEND_PROFILER_OUTPUT folder. In O2 mode, only the wait and skip_first parameters of the schedule configuration can be set to 0.

  • mstx (bool, optional) – (Ascend only) Whether to collect lightweight profiling data. Data is collected when True. Default: False.

  • data_simplification (bool, optional) – (Ascend only) Whether to remove FRAMEWORK data and other redundant data. If set to True, only the profiler deliverables and the raw performance data under the PROF_XXX directory are kept, to save space. Default: True.

  • export_type (list, optional) –

    (Ascend only) The data type to export. The db and text formats can be exported at the same time. The default value is None, indicating that data of the text type is exported.

    • ExportType.Text: Export text type data.

    • ExportType.Db: Export db type data.
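The default rules for aic_metrics and export_type described above can be sketched in plain Python. This is a minimal illustration, not the MindSpore implementation: the enums `Level`, `Metrics` and `Export` are hypothetical stand-ins that only mirror member names from this page, and `resolve_defaults` is an illustrative helper that is not part of the API. Using `None` to mean "not specified" for aic_metrics, and treating LevelNone like Level0, are assumptions, since the text only states defaults for Level0, Level1 and Level2.

```python
from enum import Enum
from typing import List, Optional, Tuple


class Level(Enum):
    """Hypothetical stand-in for mindspore.profiler.ProfilerLevel."""
    LevelNone = "LevelNone"
    Level0 = "Level0"
    Level1 = "Level1"
    Level2 = "Level2"


class Metrics(Enum):
    """Hypothetical stand-in for a subset of mindspore.profiler.AicoreMetrics."""
    AiCoreNone = "AiCoreNone"
    PipeUtilization = "PipeUtilization"
    Memory = "Memory"


class Export(Enum):
    """Hypothetical stand-in for mindspore.profiler.ExportType."""
    Text = "text"
    Db = "db"


def resolve_defaults(
    level: Level = Level.Level0,
    aic_metrics: Optional[Metrics] = None,
    export_type: Optional[List[Export]] = None,
) -> Tuple[Metrics, List[Export]]:
    """Apply the documented defaults to unspecified arguments."""
    if aic_metrics is None:
        if level in (Level.Level1, Level.Level2):
            # Documented default for Level1 and Level2.
            aic_metrics = Metrics.PipeUtilization
        else:
            # Documented default for Level0; assumed for LevelNone.
            aic_metrics = Metrics.AiCoreNone
    if export_type is None:
        # Documented: None means text data is exported.
        export_type = [Export.Text]
    return aic_metrics, list(export_type)


if __name__ == "__main__":
    print(resolve_defaults(Level.Level1))
    print(resolve_defaults(Level.Level0, export_type=[Export.Text, Export.Db]))
```

In this sketch an explicitly passed metric is returned unchanged regardless of level, which is also an assumption about how the two parameters interact.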

Raises

RuntimeError – When the version of CANN does not match the version of MindSpore, MindSpore cannot parse the generated ascend_job_id directory structure.

Supported Platforms:

Ascend GPU

Examples

>>> import numpy as np
>>> import mindspore
>>> from mindspore import nn, context
>>> import mindspore.dataset as ds
>>> from mindspore.profiler import ProfilerLevel, ProfilerActivity, AicoreMetrics, ExportType
>>>
>>> class Net(nn.Cell):
...     def __init__(self):
...         super(Net, self).__init__()
...         self.fc = nn.Dense(2,2)
...     def construct(self, x):
...         return self.fc(x)
>>>
>>> def generator():
...     for i in range(2):
...         yield np.ones([2, 2]).astype(np.float32), np.ones([2]).astype(np.int32)
>>>
>>> def train(net):
...     optimizer = nn.Momentum(net.trainable_params(), 1, 0.9)
...     loss = nn.SoftmaxCrossEntropyWithLogits(sparse=True)
...     data = ds.GeneratorDataset(generator, ["data", "label"])
...     model = mindspore.train.Model(net, loss, optimizer)
...     model.train(1, data)
>>>
>>> if __name__ == '__main__':
...     # To run on GPU, pass "GPU" to mindspore.set_device instead of "Ascend"
...     context.set_context(mode=mindspore.GRAPH_MODE)
...     mindspore.set_device("Ascend")
...
...     # Init Profiler
...     experimental_config = mindspore.profiler._ExperimentalConfig(
...                                 profiler_level=ProfilerLevel.Level0,
...                                 aic_metrics=AicoreMetrics.AiCoreNone,
...                                 l2_cache=False,
...                                 mstx=False,
...                                 data_simplification=False,
...                                 export_type=[ExportType.Text])
...     steps = 10
...     net = Net()
...     # Note that the Profiler should be initialized before model.train
...     with mindspore.profiler.profile(activities=[ProfilerActivity.CPU, ProfilerActivity.NPU],
...                                     schedule=mindspore.profiler.schedule(wait=1, warmup=1, active=2,
...                                           repeat=1, skip_first=2),
...                                     on_trace_ready=mindspore.profiler.tensorboard_trace_handler("./data"),
...                                     profile_memory=False,
...                                     experimental_config=experimental_config) as prof:
...
...         # Train Model
...         for step in range(steps):
...             train(net)
...             prof.step()
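To see which of the 10 steps in the example above would fall into each phase of the schedule, the following pure-Python sketch labels every step. It assumes the schedule follows the common skip_first, then wait, warmup, active cycle, repeated `repeat` times, with steps counted from 0 and `repeat=0` meaning the cycle continues until the loop ends. The helper `phase_per_step` and the phase names are hypothetical and are not part of mindspore.profiler.

```python
from typing import List


def phase_per_step(total_steps: int, wait: int, warmup: int, active: int,
                   repeat: int, skip_first: int) -> List[str]:
    """Label each step index with the schedule phase it is assumed to fall in."""
    cycle = wait + warmup + active
    if cycle <= 0:
        raise ValueError("wait + warmup + active must be positive")
    phases = []
    for step in range(total_steps):
        if step < skip_first:
            phases.append("skip")
            continue
        offset = step - skip_first
        if repeat > 0 and offset >= repeat * cycle:
            # All requested cycles have finished; nothing more is collected.
            phases.append("idle")
            continue
        pos = offset % cycle
        if pos < wait:
            phases.append("wait")
        elif pos < wait + warmup:
            phases.append("warmup")
        else:
            phases.append("active")
    return phases


if __name__ == "__main__":
    # Same values as the schedule(...) call in the example above.
    for step, phase in enumerate(phase_per_step(10, wait=1, warmup=1, active=2,
                                                repeat=1, skip_first=2)):
        print(step, phase)
```

Under these assumptions, only steps 4 and 5 are in the active phase, so a loop shorter than six steps would end before any data is collected.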