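Performing Inference or Training on MCUs and Small Systems
Overview
This tutorial describes an ultra-lightweight AI deployment solution for IoT edge devices.
Unlike mobile devices, IoT devices are usually built around microcontroller units (MCUs), which have very limited ROM and very little memory and compute capability.
AI applications on IoT devices therefore face strict limits on the runtime memory and power consumption of model inference.
For MCU hardware backends, MindSpore Lite provides an ultra-lightweight Micro AI deployment solution: in the offline phase, the model is converted directly into lightweight code, so no online model parsing or graph compilation is needed. The generated Micro code is easy to read, uses little runtime memory, and has a smaller code size.
With the MindSpore Lite conversion tool converter_lite, users can easily generate inference or training code that can be deployed on x86, ARM64, ARM32, and Cortex-M platforms.
Deploying a model for inference or training through Micro usually involves four steps: generating the model code, obtaining the Micro library, integrating the code, and compiling and deploying it.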
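Generating Model Inference Code
Overview
Inference code can be generated for an input model by running the MindSpore Lite conversion tool converter_lite with the Micro options set in the tool's parameter configuration file.
This chapter covers only the code generation features of the conversion tool. For basic usage of the tool, see the inference model conversion documentation.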
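Environment Preparation
The following preparation uses the conversion tool in a Linux environment as an example.
System environment required by the conversion tool
This example uses a Linux system; Ubuntu 18.04.02 LTS is recommended.
Obtaining the conversion tool
The conversion tool can be obtained in two ways. The steps below assume a downloaded release package.
Unpack the downloaded package
tar -zxf mindspore-lite-${version}-linux-x64.tar.gz
${version} is the version number of the release package.
Add the dynamic link libraries that the conversion tool needs at run time to the LD_LIBRARY_PATH environment variable
export LD_LIBRARY_PATH=${PACKAGE_ROOT_PATH}/tools/converter/lib:${LD_LIBRARY_PATH}
${PACKAGE_ROOT_PATH} is the path of the unpacked folder.
Generating Inference Code for a Single Model
Enter the conversion directory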
cd ${PACKAGE_ROOT_PATH}/tools/converter/converter
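Set the Micro options
In the current directory, create a file named micro.cfg with the following content:
[micro_param]
# enable code-generation for MCU HW
enable_micro=true
# specify HW target, support x86,Cortex-M, ARM32, ARM64 only.
target=x86
# enable parallel inference or not.
support_parallel=false
In the configuration file, the first line, [micro_param], indicates that the parameters that follow belong to the Micro option group micro_param. These parameters control code generation; their meanings are listed in Table 1.
In this example, single-model inference code is generated for a Linux system on the x86_64 architecture, so target=x86 is set to declare that platform.
Prepare the model for which inference code is to be generated
Download the MNIST handwritten digit recognition model used in this example. Unpacking the archive yields mnist.tflite, a trained MNIST classification model in TFLITE format. Copy mnist.tflite into the current conversion tool directory.
Run converter_lite to generate the code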
./converter_lite --fmk=TFLITE --modelFile=mnist.tflite --outputFile=mnist --configFile=micro.cfg
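On success, the following result is displayed:
CONVERT RESULT SUCCESS:0
For the parameters of the converter_lite tool, see the converter parameter description.
After the conversion tool runs successfully, the generated code is saved under the outputFile path specified by the user. In this example, that is the mnist folder in the current conversion directory, with the following content:
mnist                          # root directory name of the generated code, as specified
├── benchmark                  # benchmark routine that integrates and calls the model inference code
│   ├── benchmark.c
│   ├── calib_output.c
│   ├── calib_output.h
│   ├── load_input.c
│   └── load_input.h
├── CMakeLists.txt             # cmake project file of the benchmark routine
└── src                        # model inference code directory
    ├── model0                 # directory of model-related files
    │   ├── model0.c
    │   ├── net0.bin           # model weights in binary form
    │   ├── net0.c
    │   ├── net0.h
    │   ├── weight0.c
    │   └── weight0.h
    ├── CMakeLists.txt
    ├── allocator.c
    ├── allocator.h
    ├── net.cmake
    ├── model.c
    ├── model.h
    ├── context.c
    ├── context.h
    ├── tensor.c
    └── tensor.h
The src directory in the generated code contains the model inference code; benchmark is only a sample routine that integrates and calls the code in src. For more details on integration and calling, see the chapter on code integration and compilation and deployment.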
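Table 1: micro_param parameter definitions
Parameter | Required | Description | Value range | Default |
---|---|---|---|---|
enable_micro | Yes | Whether to generate code for the model; if false, a .ms file is generated instead | true, false | false |
target | Yes | Platform that the generated code targets | x86, Cortex-M, ARM32, ARM64 | x86 |
support_parallel | No | Whether to generate multithreaded inference code; can be set to true only on x86, ARM32, and ARM64 | true, false | false |
save_path | No (multi-model parameter) | Path where the generated multi-model code is saved | None | None |
project_name | No (multi-model parameter) | Project name of the generated multi-model code | None | None |
inputs_shape | No (dynamic-shape parameter) | Input shape information of the model in dynamic-shape scenarios | None | None |
dynamic_dim_params | No (dynamic-shape parameter) | Value range of the variable dimensions in dynamic-shape scenarios | None | None |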
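Generating Inference Code for Multiple Models
Enter the conversion directory
cd ${PACKAGE_ROOT_PATH}/tools/converter/converter
Set the Micro options
In the current directory, create a file named micro.cfg with the following content:
[micro_param]
# enable code-generation for MCU HW
enable_micro=true
# specify HW target, support x86,Cortex-M, ARM32, ARM64 only.
target=x86
# enable parallel inference or not.
support_parallel=false
# save generated code path.
save_path=workpath/
# set project name.
project_name=mnist
[model_param]
# input model type.
fmk=TFLITE
# path of input model file.
modelFile=mnist.tflite
[model_param]
# input model type.
fmk=TFLITE
# path of input model file.
modelFile=mnist.tflite
In the configuration file, [micro_param] indicates that the parameters that follow belong to the Micro option group micro_param. These parameters control code generation; their meanings are listed in Table 1. Each [model_param] indicates that the parameters that follow belong to the model option group model_param of the corresponding model. These parameters control the conversion of each model and cover the required parameters supported by converter_lite.
In this example, multi-model inference code is generated for a Linux system on the x86_64 architecture, so target=x86 is set to declare that platform.
Prepare the model for which inference code is to be generated
Download the MNIST handwritten digit recognition model used in this example. Unpacking the archive yields mnist.tflite, a trained MNIST classification model in TFLITE format. Copy mnist.tflite into the current conversion tool directory.
Run converter_lite to generate the code; only the configuration file needs to be specified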
./converter_lite --configFile=micro.cfg
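On success, the following result is displayed:
CONVERT RESULT SUCCESS:0
For the parameters of the converter_lite tool, see the converter parameter description.
After the conversion tool runs successfully, the generated code is saved under the path formed by save_path + project_name as specified by the user. In this example, that is the mnist folder under the configured save_path, with the following content:
mnist                          # root directory (project) name of the generated code, as specified
├── benchmark                  # benchmark routine that integrates and calls the model inference code
│   ├── benchmark.c
│   ├── calib_output.c
│   ├── calib_output.h
│   ├── load_input.c
│   └── load_input.h
├── CMakeLists.txt             # cmake project file of the benchmark routine
├── include
│   └── model_handle.h         # external interface file of the models
└── src                        # model inference code directory
    ├── model0                 # directory of files for the first model
    │   ├── model0.c
    │   ├── net0.bin           # model weights in binary form
    │   ├── net0.c
    │   ├── net0.h
    │   ├── weight0.c
    │   └── weight0.h
    ├── model1                 # directory of files for the second model
    │   ├── model1.c
    │   ├── net1.bin
    │   ├── net1.c
    │   ├── net1.h
    │   ├── weight1.c
    │   └── weight1.h
    ├── CMakeLists.txt
    ├── allocator.c
    ├── allocator.h
    ├── net.cmake
    ├── model.c
    ├── model.h
    ├── context.c
    ├── context.h
    ├── tensor.c
    └── tensor.h
The src directory in the generated code contains the model inference code; benchmark is only a sample routine that integrates and calls the code in src. In the multi-model scenario, users need to adjust benchmark to their own needs. For more details on integration and calling, see the chapter on code integration and compilation and deployment.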
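Configuring the Model Input Shape (Optional)
Configuring the model input shape at code generation time to match the input shape used in actual inference usually reduces the chance of errors during deployment.
When the model contains a Shape operator, or the input shape of the original model is not fixed, the input shape must be configured so that the related shape optimizations and code generation can be performed.
The input shape of the generated code is set with the --inputShape= option of the conversion tool. For the meaning of the parameter, see the conversion tool usage documentation.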
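Dynamic Shape Configuration (Optional)
In some inference scenarios, such as running a recognition network on the targets found by a detection network, the number of targets varies, so the input BatchSize of the recognition network is not fixed. Regenerating and redeploying the code for every required BatchSize or resolution wastes memory and slows development. Micro therefore supports dynamic shapes: in the convert phase, the dynamic parameters in [micro_param] are set through configFile, and at inference time MSModelResize is used to change the input shape.
inputs_shape specifies the shape of every model input; fixed dimensions are written as actual numbers and dynamic dimensions as placeholders. At most two variable dimensions are currently supported. dynamic_dim_params gives the value range of each variable dimension and must correspond to the placeholders used in inputs_shape. Discrete values are separated by , and a continuous range is written with ~. All parameters are written compactly, without spaces. If the model has multiple inputs, the value settings of the different inputs must be consistent and separated by ;, otherwise parsing fails.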
[micro_param]
# the name and shapes of model's all inputs.
# the format is 'input1_name:[d0,d1];input2_name:[1,d0]'
inputs_shape=input1:[d0,d1];input2:[1,d0]
# the value range of dynamic dims.
dynamic_dim_params=d0:[1,3];d1:[1~8]
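The two range notations can be illustrated with a small C sketch. It is not part of MindSpore Lite: the helper name dim_value_allowed and the idea of checking a value in application code are assumptions made for illustration. The function takes the text between the square brackets of one dynamic_dim_params entry and reports whether a requested dimension value falls inside it, which is the kind of check an application might run before calling MSModelResize.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not a MindSpore Lite API.
 * `spec` is the text inside the brackets of one dynamic_dim_params entry:
 *   "1,3"  -> the discrete values 1 and 3
 *   "1~8"  -> every integer from 1 to 8 inclusive
 * Returns true when `value` is allowed by `spec`. */
bool dim_value_allowed(const char *spec, long value) {
  const char *tilde = strchr(spec, '~');
  if (tilde != NULL) {
    /* Continuous range: lower bound before '~', upper bound after it. */
    long low = strtol(spec, NULL, 10);
    long high = strtol(tilde + 1, NULL, 10);
    return value >= low && value <= high;
  }
  /* Discrete list: walk the comma-separated numbers one by one. */
  const char *cursor = spec;
  while (*cursor != '\0') {
    char *end = NULL;
    long candidate = strtol(cursor, &end, 10);
    if (end == cursor) {
      return false; /* malformed entry: no digits where a number was expected */
    }
    if (candidate == value) {
      return true;
    }
    if (*end == ',') {
      cursor = end + 1; /* move past the separator to the next number */
    } else if (*end == '\0') {
      cursor = end;     /* end of the list */
    } else {
      return false;     /* unexpected character, e.g. a space */
    }
  }
  return false;
}
```

With the configuration above, dim_value_allowed("1,3", 2) is false because d0 only accepts 1 or 3, while dim_value_allowed("1~8", 5) is true because d1 accepts any value from 1 to 8.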
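Generating Multithreaded Parallel Inference Code (Optional)
Typical Linux-x86 and Android environments have multi-core CPUs, so enabling Micro multithreaded inference makes better use of the device and speeds up model inference.
Configuration file
Setting support_parallel to true in the configuration file generates code that supports multithreaded inference. For the meaning of each option in the configuration file, see Table 1.
An example configuration file for generating multithreaded code for x86 is as follows: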
[micro_param]
# enable code-generation for MCU HW
enable_micro=true
# specify HW target, support x86,Cortex-M, ARM32, ARM64 only.
target=x86
# enable parallel inference or not.
support_parallel=true
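Related calling interfaces
After integrating the code, users can configure multithreaded inference of the model by calling the following interfaces. For the interface parameters, see the API documentation.
Table 2: API interfaces for multithreading configuration
Purpose | Function prototype |
---|---|
Set the number of inference threads | void MSContextSetThreadNum(MSContextHandle context, int32_t thread_num) |
Set the thread affinity (core-binding) mode | void MSContextSetThreadAffinityMode(MSContextHandle context, int mode) |
Get the number of inference threads | int32_t MSContextGetThreadNum(const MSContextHandle context) |
Get the thread affinity (core-binding) mode | int MSContextGetThreadAffinityMode(const MSContextHandle context) |
Integration notes
After generating multithreaded code, users need to link the pthread standard library and the libwrapper.a static library from the Micro library. See the CMakeLists.txt file in the generated code for details.
Limitations
This feature is currently enabled only when target is set to x86, ARM32, or ARM64, and the number of inference threads can be set to at most 4.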
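These limits can be applied in application code before a thread count is handed to MSContextSetThreadNum. The following is a minimal sketch under stated assumptions: micro_thread_num and MICRO_MAX_THREADS are hypothetical names, not part of the Micro library, and the target strings are the values accepted by the target option.

```c
#include <string.h>

/* Upper bound on inference threads stated in the limitations above. */
#define MICRO_MAX_THREADS 4

/* Hypothetical helper, not a MindSpore Lite API.
 * Returns a thread count that respects the documented limits:
 * 1 on targets without multithreading support (for example Cortex-M),
 * otherwise `requested` clamped to the range [1, MICRO_MAX_THREADS]. */
int micro_thread_num(const char *target, int requested) {
  int parallel_ok = strcmp(target, "x86") == 0 ||
                    strcmp(target, "ARM32") == 0 ||
                    strcmp(target, "ARM64") == 0;
  if (!parallel_ok || requested < 1) {
    return 1;
  }
  return requested > MICRO_MAX_THREADS ? MICRO_MAX_THREADS : requested;
}
```

The returned value would then be passed as thread_num to MSContextSetThreadNum from Table 2.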
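Generating int8 Quantized Inference Code (Optional)
In MCU scenarios such as Cortex-M, device memory and compute capability are limited, so int8 quantized operators are usually used for deployment to reduce runtime memory and speed up computation.
If a fully quantized int8 model is already available, int8 quantized inference code can be generated directly by following the section on running converter_lite to generate inference code, and the rest of this section can be skipped. More commonly, only a trained float32 model is available; generating int8 quantized inference code then requires the post-training quantization feature of the conversion tool, as described below.
Configuration file
int8 quantized inference code is generated by setting the quantization control parameters in the configuration file. For a description of these parameters (the common quantization parameters common_quant_param and the full quantization parameters full_quant_param), see the quantization documentation of the conversion tool.
An example configuration file for generating int8 quantized inference code for the Cortex-M platform is as follows: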
[micro_param]
# enable code-generation for MCU HW
enable_micro=true
# specify HW target, support x86,Cortex-M, ARM32, ARM64 only.
target=Cortex-M
# code generation for Inference or Train
codegen_mode=Inference
# enable parallel inference or not
support_parallel=false
[common_quant_param]
# Supports WEIGHT_QUANT or FULL_QUANT
quant_type=FULL_QUANT
# Weight quantization support the number of bits [0,16], Set to 0 is mixed bit quantization, otherwise it is fixed bit quantization
# Full quantization support the number of bits [1,8]
bit_num=8
[data_preprocess_param]
calibrate_path=inputs:/home/input_dir
calibrate_size=100
input_type=BIN
[full_quant_param]
activation_quant_method=MAX_MIN
bias_correction=true
target_device=DSP
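Limitations
Currently, inference code can be generated only for full quantization.
target_device in the full quantization parameters full_quant_param usually needs to be set to DSP so that more operators can be quantized after training. Micro currently supports 8 int8 quantized operators (add, batchnorm, concat, conv, convolution, matmul, resize, slice). If an operator is not supported for quantization during code generation, it can be excluded through skip_quant_node in the common quantization parameters common_quant_param; excluded operator nodes still run inference in float32.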
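As background for the MAX_MIN activation quantization method named in the configuration, the following sketch shows generic min-max affine quantization of float32 values to int8. It illustrates the general technique only and is an assumption about the idea behind the option, not the converter's actual implementation; QuantParam and the function names are hypothetical.

```c
#include <stdint.h>

/* Affine int8 quantization parameters: real = scale * (q - zero_point). */
typedef struct {
  float scale;
  int32_t zero_point;
} QuantParam;

/* Rounds to the nearest integer, halves away from zero (avoids needing libm). */
static long round_nearest(float x) {
  return (long)(x >= 0.0f ? x + 0.5f : x - 0.5f);
}

/* Derives parameters from the observed minimum and maximum of a tensor.
 * The range is widened to include 0 so that zero is exactly representable. */
QuantParam quant_param_from_min_max(float min_val, float max_val) {
  QuantParam p;
  if (min_val > 0.0f) min_val = 0.0f;
  if (max_val < 0.0f) max_val = 0.0f;
  float range = max_val - min_val;
  p.scale = (range > 0.0f) ? range / 255.0f : 1.0f;
  /* min_val maps to -128, the lowest int8 value. */
  p.zero_point = (int32_t)round_nearest(-128.0f - min_val / p.scale);
  return p;
}

/* Maps a float to int8, saturating at the int8 limits. */
int8_t quantize(float value, QuantParam p) {
  long q = round_nearest(value / p.scale) + p.zero_point;
  if (q < -128) q = -128;
  if (q > 127) q = 127;
  return (int8_t)q;
}

/* Maps an int8 value back to its approximate float value. */
float dequantize(int8_t q, QuantParam p) {
  return p.scale * (float)(q - p.zero_point);
}
```

For a tensor whose values lie in [0, 1], this maps 0.0 to -128 and 1.0 to 127, and a value that is quantized and then dequantized differs from the original by less than one scale step.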
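Generating Model Training Code
Overview
Training code can be generated for an input model by running the MindSpore Lite conversion tool converter_lite with the Micro options set in the tool's parameter configuration file.
This chapter covers only the code generation features of the conversion tool. For basic usage of the tool, see the training model conversion documentation.
Environment Preparation
See the environment preparation section above; it is not repeated here.
Running converter_lite to Generate Training Code
Enter the conversion directory
cd ${PACKAGE_ROOT_PATH}/tools/converter/converter
Set the Micro options
In the current directory, create a file named micro.cfg with the following content:
[micro_param]
# enable code-generation for MCU HW
enable_micro=true
# specify HW target, support x86,Cortex-M, ARM32, ARM64 only.
target=x86
# code generation for Inference or Train. Cortex-M is unsupported when codegen_mode is Train.
codegen_mode=Train
Run converter_lite to generate the code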
./converter_lite --fmk=MINDIR --trainModel=True --modelFile=my_model.mindir --outputFile=my_model --configFile=micro.cfg
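On success, the following result is displayed:
CONVERT RESULT SUCCESS:0
After the conversion tool runs successfully, the generated code is saved under the outputFile path specified by the user. In this example, that is the my_model folder in the current conversion directory, with the following content:
my_model                       # root directory name of the generated code, as specified
├── benchmark                  # benchmark routine that integrates and calls the model training code
│   ├── benchmark.c
│   ├── calib_output.c
│   ├── calib_output.h
│   ├── load_input.c
│   └── load_input.h
├── CMakeLists.txt             # cmake project file of the benchmark routine
└── src                        # model training code directory
    ├── CMakeLists.txt
    ├── net.bin                # model weights in binary form
    ├── net.c
    ├── net.cmake
    ├── net.h
    ├── model.c
    ├── context.c
    ├── context.h
    ├── tensor.c
    ├── tensor.h
    ├── weight.c
    └── weight.h
For the APIs involved in the training workflow, see the introduction to the training interfaces.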
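Obtaining the Micro Library
After generating the model inference code and before integrating it, users need to obtain the Micro library that the generated inference code depends on.
Inference code for each platform depends on the Micro library for that platform. Specify the platform through the Micro option target when generating the code, and obtain the Micro library for the same platform.
Release packages for each platform can be downloaded from the MindSpore website.
In the chapter on generating model inference code, inference code was generated for Linux on the x86_64 architecture. The Micro library that this code depends on is included in the release package used by the conversion tool.
Within the release package, the libraries and header files that the inference code depends on are as follows: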
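mindspore-lite-{version}-linux-x64
├── runtime
│   └── include
│       └── c_api                 # C API header files for MindSpore Lite integration development
└── tools
    └── codegen                   # include and lib dependencies of the generated source code
        ├── include               # inference framework header files
        │   ├── nnacl             # nnacl operator header files
        │   └── wrapper           # wrapper operator header files
        ├── lib
        │   ├── libwrapper.a      # static library of some operators that the code generated by MindSpore Lite codegen depends on
        │   └── libnnacl.a        # nnacl operator static library that the code generated by MindSpore Lite codegen depends on
        └── third_party
            ├── include
            │   └── CMSIS         # ARM CMSIS NN operator header files
            └── lib
                └── libcmsis_nn.a # ARM CMSIS NN operator static library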
Code Integration and Compilation and Deployment
The benchmark directory of the generated code contains sample calls to the inference code interfaces. Users can refer to the benchmark routine when integrating the src inference code into their own applications.
Calling interfaces of the inference code
The general calling interfaces of the inference code are listed below. For detailed descriptions, see the API documentation.
Table 3: General inference API interfaces
Purpose | Function prototype |
---|---|
Create a Model | MSModelHandle MSModelCreate() |
Destroy a Model | void MSModelDestroy(MSModelHandle *model) |
Calculate the workspace size a Model needs at run time (Cortex-M only) | size_t MSModelCalcWorkspaceSize(MSModelHandle model) |
Set the workspace a Model uses at run time (Cortex-M only) | void MSModelSetWorkspace(MSModelHandle model, void *workspace, size_t workspace_size) |
Build a Model | MSStatus MSModelBuild(MSModelHandle model, const void *model_data, size_t data_size, MSModelType model_type, const MSContextHandle model_context) |
Set the input shape of a Model | MSStatus MSModelResize(MSModelHandle model, const MSTensorHandleArray inputs, MSShapeInfo *shape_infos, size_t shape_info_num) |
Run inference on a Model | MSStatus MSModelPredict(MSModelHandle model, const MSTensorHandleArray inputs, MSTensorHandleArray *outputs, const MSKernelCallBackC before, const MSKernelCallBackC after) |
Get all input Tensors | MSTensorHandleArray MSModelGetInputs(const MSModelHandle model) |
Get all output Tensors | MSTensorHandleArray MSModelGetOutputs(const MSModelHandle model) |
Get an input Tensor by name | MSTensorHandle MSModelGetInputByTensorName(const MSModelHandle model, const char *tensor_name) |
Get an output Tensor by name | MSTensorHandle MSModelGetOutputByTensorName(const MSModelHandle model, const char *tensor_name) |
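Calling interfaces of the training code
The general calling interfaces of the training code are listed below.
Table 4: General training API interfaces (only training-related interfaces are listed)
Purpose | Function prototype |
---|---|
Run a single step of a Model | MSStatus MSModelRunStep(MSModelHandle model, const MSKernelCallBackC before, const MSKernelCallBackC after) |
Set the execution mode of a Model | MSStatus MSModelSetTrainMode(MSModelHandle model, bool train) |
Export the weights of a Model | MSStatus MSModelExportWeight(MSModelHandle model, const char *export_path) |
Integration differences between platforms
Code integration and compilation and deployment differ from platform to platform.
For MCUs based on the Cortex-M architecture, see Performing Inference on an MCU.
For Linux on the x86_64 architecture, see the documentation on compilation and deployment on the Linux_x86_64 platform.
For Android on arm32 or arm64, see the documentation on compilation and deployment on the Android platform.
For OpenHarmony, see the documentation on performing inference on light Harmony devices.
Multi-model inference integration
Multi-model integration is similar to single-model integration, with one difference. In the single-model scenario, the model is created through the MSModelCreate interface. In the multi-model scenario, MSModelHandle handles are provided to the user; by operating on the MSModelHandle of each model and calling the general single-model inference APIs, different models can be integrated. For the MSModelHandle handles, see the model_handle.h file in the multi-model file directory.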
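Performing Inference on an MCU
Overview
This tutorial uses the deployment of the MNIST model on the STM32F767 chip as an example to show how to deploy an inference model on an MCU based on the Cortex-M architecture. It consists of the following steps:
Generate model inference code for the Cortex-M architecture with the converter_lite conversion tool
Download the Micro library for the Cortex-M architecture
Integrate the generated inference code with the Micro library, then compile, deploy, and verify
On Windows, the tutorial shows how to integrate the inference code with IAR; on Linux, it shows how to integrate the code by cross-compiling with a Makefile.
Generating MCU Inference Code
To generate inference code for an MCU, follow the chapter on generating model inference code and change target=x86 in the Micro options to target=Cortex-M.
After successful generation, the folder content is as follows: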
mnist                          # root directory name of the generated code, as specified
├── benchmark                  # benchmark routine that integrates and calls the model inference code
│   ├── benchmark.c
│   ├── calib_output.c
│   ├── calib_output.h
│   ├── data.c
│   ├── data.h
│   ├── load_input.c
│   └── load_input.h
├── build.sh                   # one-step build script
├── CMakeLists.txt             # cmake project file of the benchmark routine
├── cortex-m7.toolchain.cmake  # cmake file for cross-compiling for cortex-m7
└── src                        # model inference code directory
    ├── CMakeLists.txt
    ├── context.c
    ├── context.h
    ├── model.c
    ├── net.c
    ├── net.cmake
    ├── net.h
    ├── tensor.c
    ├── tensor.h
    ├── weight.c
    └── weight.h
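Downloading the Micro Library for the Cortex-M Architecture
The STM32F767 chip is based on the Cortex-M7 architecture. The Micro library for this architecture can be obtained in two ways:
Download a release package from the MindSpore website.
Download the release package whose operating system is None and whose hardware platform is Cortex-M7.
Build from source.
Run the following command to build the Cortex-M7 release package:
MSLITE_MICRO_PLATFORM=cortex-m7 bash build.sh -I x86_64
For other Cortex-M platforms for which no release package is provided for download, users can follow the build-from-source approach, modify the MindSpore source code, and build the release package manually.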
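Code Integration and Compilation and Deployment on Windows: Integrated Development with IAR
This example uses IAR for code integration and flashing to show how to integrate the generated inference code on Windows. It consists of the following steps:
Download the required software and prepare the integration environment
Generate the required MCU startup code and demo project with STM32CubeMX
Integrate the model inference code and the Micro library in IAR
Compile and run in the simulator
Environment Preparation
STM32CubeMX for Windows >= 6.0.1
STM32CubeMX is a graphical configuration tool for STM32 chips provided by STMicroelectronics. It is used to generate the startup code and project for STM chips.
IAR EWARM >= 9.1
IAR EWARM is an integrated development environment developed by IAR Systems for ARM microprocessors.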
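Obtaining the MCU startup code and project
If you already have your own MCU project, skip this section.
This section uses the STM32F767 chip as an example to show how to generate an MCU project for an STM32 chip with STM32CubeMX.
Start STM32CubeMX and select New Project from the File menu to create a project.
In the MCU/MPU Selector window, search for and select STM32F767IGT6, then click Start Project to create a project for that chip.
On the Project Manager page, set the project name and the path of the generated project, and select EWARM under Toolchain / IDE to generate an IAR project.
Click GENERATE CODE at the top to generate the code.
On a PC with IAR installed, double-click Project.eww in the EWARM directory of the generated project to open the IAR project.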
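Integrating the model inference code and the Micro library
Copy the generated inference code into the project, then unpack the archive obtained in the section on downloading the Micro library for the Cortex-M architecture and place it in the generated inference code directory. The directory layout is as follows:
test_stm32f767                                # MCU project directory
├── Core
│   ├── Inc
│   └── Src
│       ├── main.c
│       └── ...
├── Drivers
├── EWARM                                     # IAR project file directory
└── mnist                                     # root directory of the generated code
    ├── benchmark                             # benchmark routine that integrates and calls the model inference code
    │   ├── benchmark.c
    │   ├── data.c
    │   ├── data.h
    │   └── ...
    ├── mindspore-lite-1.8.0-none-cortex-m7   # downloaded Micro library for the Cortex-M7 architecture
    ├── src                                   # model inference code directory
    └── ...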
Importing source files into the IAR project
Open the IAR project. In the Workspace view, right-click the project and select Add -> Add Group to add a group named mnist. Right-click that group and repeat the operation to create the groups src and benchmark. In each group, select Add -> Add Files and add the source files from the src and benchmark directories of the mnist folder to the corresponding group.
Adding the required header file paths and static libraries
In the Workspace view, right-click the project and select Options to open the project options window. Select C/C++ Compiler on the left, then open the Preprocessor tab on the right and add the header file paths that the inference code depends on to the list. The header file paths added in this example are:
$PROJ_DIR$/../mnist/mindspore-lite-1.8.0-none-cortex-m7/runtime
$PROJ_DIR$/../mnist/mindspore-lite-1.8.0-none-cortex-m7/runtime/include
$PROJ_DIR$/../mnist/mindspore-lite-1.8.0-none-cortex-m7/tools/codegen/include
$PROJ_DIR$/../mnist/mindspore-lite-1.8.0-none-cortex-m7/tools/codegen/third_party/include/CMSIS/Core
$PROJ_DIR$/../mnist/mindspore-lite-1.8.0-none-cortex-m7/tools/codegen/third_party/include/CMSIS/DSP
$PROJ_DIR$/../mnist/mindspore-lite-1.8.0-none-cortex-m7/tools/codegen/third_party/include/CMSIS/NN
$PROJ_DIR$/../mnist
In the project options window, select Linker on the left, then open the Library tab on the right and add the operator static libraries that the inference code depends on to the list. The static library files added in this example are:
$PROJ_DIR$/../mnist/mindspore-lite-1.8.0-none-cortex-m7/tools/codegen/lib/libwrapper.a
$PROJ_DIR$/../mnist/mindspore-lite-1.8.0-none-cortex-m7/tools/codegen/lib/libnnacl.a
$PROJ_DIR$/../mnist/mindspore-lite-1.8.0-none-cortex-m7/tools/codegen/third_party/lib/libcmsis_nn.a
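Modifying main.c to call the benchmark function
Add the header include at the top of main.c and call the benchmark function from benchmark.c inside main. The program in the benchmark folder is a sample that runs the generated inference code in src and compares the output; users are free to modify it.
#include <stdio.h>  /* for printf */
#include "benchmark/benchmark.h"
...
int main(void) {
  ...
  if (benchmark() == 0) {
    printf("\nrun success.\n");
  } else {
    printf("\nrun failed.\n");
  }
  ...
}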
Modifying mnist/benchmark/data.c to store the reference input and output data in the program for comparison
The benchmark routine sets the input data of the model, compares the inference result with the expected result, and reports the error deviation. In this example, the input data of the model is set by modifying the calib_input0_data array in data.c, and the expected result is set by modifying calib_output0_data.
float calib_input0_data[NET_INPUT0_SIZE] = {0.54881352186203,0.7151893377304077,0.6027633547782898,0.5448831915855408,0.42365479469299316,0.6458941102027893,0.4375872015953064,0.891772985458374,0.9636627435684204,0.3834415078163147,0.7917250394821167,0.5288949012756348,0.5680445432662964,0.9255966544151306,0.07103605568408966,0.08712930232286453,0.020218396559357643,0.832619845867157,0.7781567573547363,0.8700121641159058,0.978618323802948,0.7991585731506348,0.4614793658256531,0.7805292010307312,0.11827442795038223,0.6399210095405579,0.14335328340530396,0.9446688890457153,0.5218483209609985,0.4146619439125061,0.26455560326576233,0.7742336988449097,0.4561503231525421,0.568433940410614,0.018789799883961678,0.6176354885101318,0.6120957136154175,0.6169340014457703,0.9437480568885803,0.681820273399353,0.35950788855552673,0.43703195452690125,0.6976311802864075,0.0602254718542099,0.6667667031288147,0.670637845993042,0.21038256585597992,0.12892629206180573,0.31542834639549255,0.36371076107025146,0.5701967477798462,0.4386015236377716,0.9883738160133362,0.10204481333494186,0.20887675881385803,0.16130951046943665,0.6531082987785339,0.25329160690307617,0.4663107693195343,0.24442559480667114,0.15896958112716675,0.11037514358758926,0.6563295722007751,0.13818295300006866,0.1965823620557785,0.3687251806259155,0.8209932446479797,0.09710127860307693,0.8379449248313904,0.0960984081029892,0.9764594435691833,0.4686512053012848,0.9767611026763916,0.6048455238342285,0.7392635941505432,0.03918779268860817,0.28280696272850037,0.12019655853509903,0.296140193939209,0.11872772127389908,0.3179831802845001,0.414262980222702,0.06414749473333359,0.6924721002578735,0.5666014552116394,0.26538950204849243,0.5232480764389038,0.09394051134586334,0.5759465098381042,0.9292961955070496,0.3185689449310303,0.6674103736877441,0.13179786503314972,0.7163271903991699,0.28940609097480774,0.18319135904312134,0.5865129232406616,0.02010754682123661,0.8289400339126587,0.004695476032793522,0.6778165102005005,0
.2700079679489136,0.7351940274238586,0.9621885418891907,0.2487531453371048,0.5761573314666748,0.5920419096946716,0.5722519159317017,0.22308163344860077,0.9527490139007568,0.4471253752708435,0.8464086651802063,0.6994792819023132,0.2974369525909424,0.8137978315353394,0.396505743265152,0.8811032176017761,0.5812729001045227,0.8817353844642639,0.6925315856933594,0.7252542972564697,0.5013243556022644,0.9560836553573608,0.6439902186393738,0.4238550364971161,0.6063932180404663,0.019193198531866074,0.30157482624053955,0.6601735353469849,0.2900775969028473,0.6180154085159302,0.42876869440078735,0.1354740709066391,0.29828232526779175,0.5699648857116699,0.5908727645874023,0.5743252635002136,0.6532008051872253,0.6521032452583313,0.43141844868659973,0.8965466022491455,0.36756187677383423,0.4358649253845215,0.8919233679771423,0.806194007396698,0.7038885951042175,0.10022688657045364,0.9194825887680054,0.7142413258552551,0.9988470077514648,0.14944830536842346,0.8681260347366333,0.16249293088912964,0.6155595779418945,0.1238199844956398,0.8480082154273987,0.8073189854621887,0.5691007375717163,0.40718328952789307,0.06916699558496475,0.6974287629127502,0.45354267954826355,0.7220556139945984,0.8663823008537292,0.9755215048789978,0.855803370475769,0.011714084073901176,0.359978049993515,0.729990541934967,0.17162968218326569,0.5210366249084473,0.054337989538908005,0.19999653100967407,0.01852179504930973,0.793697714805603,0.2239246815443039,0.3453516662120819,0.9280812740325928,0.704414427280426,0.031838931143283844,0.1646941602230072,0.6214783787727356,0.5772286057472229,0.23789282143115997,0.9342139959335327,0.6139659285545349,0.5356327891349792,0.5899099707603455,0.7301220297813416,0.31194499135017395,0.39822107553482056,0.20984375476837158,0.18619300425052643,0.9443724155426025,0.739550769329071,0.49045881628990173,0.22741462290287018,0.2543564736843109,0.058029159903526306,0.43441662192344666,0.3117958903312683,0.6963434815406799,0.37775182723999023,0.1796036809682846,0.0246787276118993
76,0.06724963337182999,0.6793927550315857,0.4536968469619751,0.5365791916847229,0.8966712951660156,0.990338921546936,0.21689698100090027,0.6630781888961792,0.2633223831653595,0.02065099962055683,0.7583786249160767,0.32001715898513794,0.38346388936042786,0.5883170962333679,0.8310484290122986,0.6289818286895752,0.872650682926178,0.27354204654693604,0.7980468273162842,0.18563593924045563,0.9527916312217712,0.6874882578849792,0.21550767123699188,0.9473705887794495,0.7308558225631714,0.2539416551589966,0.21331197023391724,0.518200695514679,0.02566271834075451,0.20747007429599762,0.4246854782104492,0.3741699755191803,0.46357542276382446,0.27762871980667114,0.5867843627929688,0.8638556003570557,0.11753185838460922,0.517379105091095,0.13206811249256134,0.7168596982955933,0.39605969190597534,0.5654212832450867,0.1832798421382904,0.14484776556491852,0.4880562722682953,0.35561272501945496,0.9404319524765015,0.7653252482414246,0.748663604259491,0.9037197232246399,0.08342243731021881,0.5521924495697021,0.5844760537147522,0.961936354637146,0.29214751720428467,0.24082878232002258,0.10029394179582596,0.016429629176855087,0.9295293092727661,0.669916570186615,0.7851529121398926,0.28173011541366577,0.5864101648330688,0.06395526975393295,0.48562759160995483,0.9774951338768005,0.8765052556991577,0.3381589651107788,0.961570143699646,0.23170162737369537,0.9493188261985779,0.9413776993751526,0.799202561378479,0.6304479241371155,0.8742879629135132,0.2930202782154083,0.8489435315132141,0.6178767085075378,0.013236857950687408,0.34723350405693054,0.14814086258411407,0.9818294048309326,0.4783703088760376,0.49739137291908264,0.6394725441932678,0.36858460307121277,0.13690027594566345,0.8221177458763123,0.1898479163646698,0.5113189816474915,0.2243170291185379,0.09784448146820068,0.8621914982795715,0.9729194641113281,0.9608346819877625,0.9065554738044739,0.774047315120697,0.3331451416015625,0.08110138773918152,0.40724116563796997,0.2322341352701187,0.13248763978481293,0.053427182137966156,0.7255943
417549133,0.011427458375692368,0.7705807685852051,0.14694663882255554,0.07952208071947098,0.08960303664207458,0.6720477938652039,0.24536721408367157,0.4205394685268402,0.557368814945221,0.8605511784553528,0.7270442843437195,0.2703278958797455,0.131482794880867,0.05537432059645653,0.3015986382961273,0.2621181607246399,0.45614057779312134,0.6832813620567322,0.6956254243850708,0.28351885080337524,0.3799269497394562,0.18115095794200897,0.7885454893112183,0.05684807524085045,0.6969972252845764,0.7786954045295715,0.7774075865745544,0.25942257046699524,0.3738131523132324,0.5875996351242065,0.27282190322875977,0.3708527982234955,0.19705428183078766,0.4598558843135834,0.044612299650907516,0.7997958660125732,0.07695644348859787,0.5188351273536682,0.3068101108074188,0.5775429606437683,0.9594333171844482,0.6455702185630798,0.03536243736743927,0.4304024279117584,0.5100168585777283,0.5361775159835815,0.6813924908638,0.2775960862636566,0.12886056303977966,0.3926756680011749,0.9564056992530823,0.1871308982372284,0.9039839506149292,0.5438059568405151,0.4569114148616791,0.8820413947105408,0.45860394835472107,0.7241676449775696,0.3990253210067749,0.9040443897247314,0.6900250315666199,0.6996220350265503,0.32772040367126465,0.7567786574363708,0.6360610723495483,0.2400202751159668,0.16053882241249084,0.796391487121582,0.9591665863990784,0.4581388235092163,0.5909841656684875,0.8577226400375366,0.45722344517707825,0.9518744945526123,0.5757511854171753,0.8207671046257019,0.9088436961174011,0.8155238032341003,0.15941447019577026,0.6288984417915344,0.39843425154685974,0.06271295249462128,0.4240322411060333,0.25868406891822815,0.849038302898407,0.03330462798476219,0.9589827060699463,0.35536885261535645,0.3567068874835968,0.01632850244641304,0.18523232638835907,0.40125951170921326,0.9292914271354675,0.0996149331331253,0.9453015327453613,0.869488537311554,0.4541623890399933,0.326700896024704,0.23274412751197815,0.6144647002220154,0.03307459130883217,0.015606064349412918,0.428795725107193,0.06807
407736778259,0.2519409954547882,0.2211609184741974,0.253191202878952,0.13105523586273193,0.012036222964525223,0.11548429727554321,0.6184802651405334,0.9742562174797058,0.9903450012207031,0.40905410051345825,0.1629544198513031,0.6387617588043213,0.4903053343296051,0.9894098043441772,0.06530420482158661,0.7832344174385071,0.28839850425720215,0.24141861498355865,0.6625045537948608,0.24606318771839142,0.6658591032028198,0.5173085331916809,0.4240889847278595,0.5546877980232239,0.2870515286922455,0.7065746784210205,0.414856880903244,0.3605455458164215,0.8286569118499756,0.9249669313430786,0.04600730910897255,0.2326269894838333,0.34851935505867004,0.8149664998054504,0.9854914546012878,0.9689717292785645,0.904948353767395,0.2965562641620636,0.9920112490653992,0.24942004680633545,0.10590615123510361,0.9509525895118713,0.2334202527999878,0.6897682547569275,0.05835635960102081,0.7307090759277344,0.8817201852798462,0.27243688702583313,0.3790569007396698,0.3742961883544922,0.7487882375717163,0.2378072440624237,0.17185309529304504,0.4492916464805603,0.30446839332580566,0.8391891121864319,0.23774182796478271,0.5023894309997559,0.9425836205482483,0.6339976787567139,0.8672894239425659,0.940209686756134,0.7507648468017578,0.6995750665664673,0.9679655432701111,0.9944007992744446,0.4518216848373413,0.07086978107690811,0.29279401898384094,0.15235470235347748,0.41748636960983276,0.13128933310508728,0.6041178107261658,0.38280805945396423,0.8953858613967896,0.96779465675354,0.5468848943710327,0.2748235762119293,0.5922304391860962,0.8967611789703369,0.40673333406448364,0.5520782470703125,0.2716527581214905,0.4554441571235657,0.4017135500907898,0.24841345846652985,0.5058664083480835,0.31038081645965576,0.37303486466407776,0.5249704718589783,0.7505950331687927,0.3335074782371521,0.9241587519645691,0.8623185753822327,0.048690296709537506,0.2536425292491913,0.4461355209350586,0.10462789237499237,0.34847599267959595,0.7400975227355957,0.6805144548416138,0.6223844289779663,0.7105283737182617,0.20
492368936538696,0.3416981101036072,0.676242470741272,0.879234790802002,0.5436780452728271,0.2826996445655823,0.030235258862376213,0.7103368043899536,0.007884103804826736,0.37267908453941345,0.5305371880531311,0.922111451625824,0.08949454873800278,0.40594232082366943,0.024313200265169144,0.3426109850406647,0.6222310662269592,0.2790679335594177,0.2097499519586563,0.11570323258638382,0.5771402716636658,0.6952700018882751,0.6719571352005005,0.9488610029220581,0.002703213831409812,0.6471966505050659,0.60039222240448,0.5887396335601807,0.9627703428268433,0.016871673986315727,0.6964824199676514,0.8136786222457886,0.5098071694374084,0.33396488428115845,0.7908401489257812,0.09724292904138565,0.44203564524650574,0.5199523568153381,0.6939564347267151,0.09088572859764099,0.2277594953775406,0.4103015661239624,0.6232946515083313,0.8869608044624329,0.618826150894165,0.13346147537231445,0.9805801510810852,0.8717857599258423,0.5027207732200623,0.9223479628562927,0.5413808226585388,0.9233060479164124,0.8298973441123962,0.968286395072937,0.919782817363739,0.03603381663560867,0.1747720092535019,0.3891346752643585,0.9521427154541016,0.300028920173645,0.16046763956546783,0.8863046765327454,0.4463944137096405,0.9078755974769592,0.16023047268390656,0.6611174941062927,0.4402637481689453,0.07648676633834839,0.6964631676673889,0.2473987489938736,0.03961552307009697,0.05994429811835289,0.06107853725552559,0.9077329635620117,0.7398838996887207,0.8980623483657837,0.6725823283195496,0.5289399027824402,0.30444636940956116,0.997962236404419,0.36218905448913574,0.47064894437789917,0.37824517488479614,0.979526937007904,0.1746583878993988,0.32798799872398376,0.6803486943244934,0.06320761889219284,0.60724937915802,0.47764649987220764,0.2839999794960022,0.2384132742881775,0.5145127177238464,0.36792758107185364,0.4565199017524719,0.3374773859977722,0.9704936742782593,0.13343943655490875,0.09680395573377609,0.3433917164802551,0.5910269021987915,0.6591764688491821,0.3972567617893219,0.9992780089378357,0.35
189300775527954,0.7214066386222839,0.6375827193260193,0.8130538463592529,0.9762256741523743,0.8897936344146729,0.7645619511604309,0.6982485055923462,0.335498183965683,0.14768557250499725,0.06263600289821625,0.2419017106294632,0.432281494140625,0.521996259689331,0.7730835676193237,0.9587409496307373,0.1173204779624939,0.10700414329767227,0.5896947383880615,0.7453980445861816,0.848150372505188,0.9358320832252502,0.9834262132644653,0.39980170130729675,0.3803351819515228,0.14780867099761963,0.6849344372749329,0.6567619442939758,0.8620625734329224,0.09725799411535263,0.49777689576148987,0.5810819268226624,0.2415570467710495,0.16902540624141693,0.8595808148384094,0.05853492394089699,0.47062090039253235,0.11583399772644043,0.45705875754356384,0.9799623489379883,0.4237063527107239,0.857124924659729,0.11731556057929993,0.2712520658969879,0.40379273891448975,0.39981213212013245,0.6713835000991821,0.3447181284427643,0.713766872882843,0.6391869187355042,0.399161159992218,0.43176013231277466,0.614527702331543,0.0700421929359436,0.8224067091941833,0.65342116355896,0.7263424396514893,0.5369229912757874,0.11047711223363876,0.4050356149673462,0.40537357330322266,0.3210429847240448,0.029950324445962906,0.73725426197052,0.10978446155786514,0.6063081622123718,0.7032175064086914,0.6347863078117371,0.95914226770401,0.10329815745353699,0.8671671748161316,0.02919023483991623,0.534916877746582,0.4042436182498932,0.5241838693618774,0.36509987711906433,0.19056691229343414,0.01912289671599865,0.5181497931480408,0.8427768349647522,0.3732159435749054,0.2228638231754303,0.080532006919384,0.0853109210729599,0.22139644622802734,0.10001406073570251,0.26503971219062805,0.06614946573972702,0.06560486555099487,0.8562761545181274,0.1621202677488327,0.5596824288368225,0.7734555602073669,0.4564095735549927,0.15336887538433075,0.19959613680839539,0.43298420310020447,0.52823406457901,0.3494403064250946,0.7814795970916748,0.7510216236114502,0.9272118210792542,0.028952548280358315,0.8956912755966187,0.3925687
9687309265,0.8783724904060364,0.690784752368927,0.987348735332489,0.7592824697494507,0.3645446300506592,0.5010631680488586,0.37638914585113525,0.364911824464798,0.2609044909477234,0.49597030878067017,0.6817399263381958,0.27734026312828064,0.5243797898292542,0.117380291223526,0.1598452925682068,0.04680635407567024,0.9707314372062683,0.0038603513967245817,0.17857997119426727,0.6128667593002319,0.08136960119009018,0.8818964958190918,0.7196201682090759,0.9663899540901184,0.5076355338096619,0.3004036843776703,0.549500584602356,0.9308187365531921,0.5207614302635193,0.2672070264816284,0.8773987889289856,0.3719187378883362,0.0013833499979227781,0.2476850152015686,0.31823351979255676,0.8587774634361267,0.4585031569004059,0.4445872902870178,0.33610227704048157,0.880678117275238,0.9450267553329468,0.9918903112411499,0.3767412602901459,0.9661474227905273,0.7918795943260193,0.675689160823822,0.24488948285579681,0.21645726263523102,0.1660478264093399,0.9227566123008728,0.2940766513347626,0.4530942440032959,0.49395784735679626,0.7781715989112854,0.8442349433898926,0.1390727013349533,0.4269043505191803,0.842854917049408,0.8180332779884338}; float calib_output0_data[NET_OUTPUT0_SIZE] = {3.5647096e-05,6.824297e-08,0.009327697,3.2340475e-05,1.1117579e-05,1.5117058e-06,4.6314454e-07,5.161628e-11,0.9905911,3.8835238e-10};
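The comparison step can be pictured with the following sketch, which computes the mean absolute error between an inference output and the expected values. It is a simplified stand-in under stated assumptions, not the code in the generated benchmark: mean_abs_error and outputs_match are hypothetical names, and the tolerance is left to the caller.

```c
#include <stddef.h>

/* Hypothetical helper, not taken from the generated benchmark.
 * Returns the mean absolute difference between two float buffers. */
float mean_abs_error(const float *actual, const float *expected, size_t count) {
  if (count == 0) {
    return 0.0f;
  }
  float sum = 0.0f;
  for (size_t i = 0; i < count; ++i) {
    float diff = actual[i] - expected[i];
    sum += (diff < 0.0f) ? -diff : diff;
  }
  return sum / (float)count;
}

/* Returns 1 when the mean error is within `tolerance`, 0 otherwise. */
int outputs_match(const float *actual, const float *expected, size_t count,
                  float tolerance) {
  return mean_abs_error(actual, expected, count) <= tolerance;
}
```

In an application, actual would point at the data of the model's output tensor and expected at an array such as calib_output0_data.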
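Compiling and running in the simulator
In this example, the inference result is inspected through software simulation.
In the Workspace view, right-click the project and select Options to open the project options window. Select Debugger on the left, and on the Setup tab on the right set Driver to Simulator to enable software simulation.
Close the project options window and click Project -> Download and Debug in the menu bar to compile the project and start the simulation. By setting a breakpoint at the call to benchmark, you can observe the inference result of the simulated run and the return value of the benchmark() function.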
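When inspecting the result, the ten output values can be read as per-class scores for the digits 0 to 9. This interpretation is an assumption for an MNIST classifier; the source does not state the output layout. A minimal sketch of picking the predicted digit follows; argmax_f32 is a hypothetical helper, not part of the generated code.

```c
#include <stddef.h>

/* Hypothetical helper: index of the largest element, or -1 for an empty buffer. */
int argmax_f32(const float *values, size_t count) {
  if (count == 0) {
    return -1;
  }
  size_t best = 0;
  for (size_t i = 1; i < count; ++i) {
    if (values[i] > values[best]) {
      best = i;
    }
  }
  return (int)best;
}
```

Applied to the calib_output0_data values in data.c, the largest entry is 0.9905911 at index 8.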
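Code Integration and Compilation and Deployment on Linux: Integration with a Makefile
This section uses Makefile-based integration of the generated model code on Linux as an example to show the general steps of integrating MCU inference code on Linux. It consists of the following steps:
Download the required software and prepare the cross-compilation and flashing environment
Generate the required MCU startup code and demo project with STM32CubeMX
Modify the Makefile to integrate the model inference code and the Micro library
Compile the project and flash it
Read the results from the board and verify them
The complete demo code built in this example is available for download.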
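Environment Preparation
CMake >= 3.18.3
GNU Arm Embedded Toolchain >= 10-2020-q4-major-x86_64-linux
This is the Linux cross-compilation toolchain for Cortex-M.
Download the x86_64-linux version of gcc-arm-none-eabi, extract it, and add its bin directory to the PATH environment variable: export PATH=<path to gcc-arm-none-eabi>/bin:$PATH.
STM32CubeMX-Lin >= 6.5.0
STM32CubeMX is the graphical configuration tool for STM32 chips provided by STMicroelectronics. It is used to generate the startup code and project for STM32 chips.
STM32CubePrg-Lin >= 6.5.0
This is the programming tool provided by STMicroelectronics. It can be used to flash programs and read data back.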
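As a quick sanity check before moving on, the sketch below reports which command-line tools cannot be found on PATH. The function name check_tools is hypothetical, and the tool names passed to it (cmake, make, arm-none-eabi-gcc) are assumptions about how the packages expose their executables. It checks presence only, not the minimum versions listed above.

```shell
#!/bin/sh
# check_tools NAME...: print each tool that is not on PATH.
# Returns 0 when every tool is found, 1 otherwise.
# (Hypothetical helper, not part of MindSpore Lite or the ST tools.)
check_tools() {
    missing=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}

# Example call; adjust the names to match your installation.
check_tools cmake make arm-none-eabi-gcc || echo "install the missing tools and update PATH"
```

The graphical ST tools are not covered here because their launcher names depend on how they were installed.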
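Obtaining the MCU Startup Code and Project
If you already have your own MCU project, skip this section.
This section uses a startup project for the STM32F767 chip as an example to show how to generate an MCU project for an STM32 chip with STM32CubeMX.
Start STM32CubeMX and select New Project from the File menu to create a project.
In the MCU/MPU Selector window, search for and select STM32F767IGT6, then click Start Project to create a project for this chip.
On the Project Manager page, set the project name and the path of the generated project, and select Makefile under Toolchain / IDE so that a Makefile project is generated.
Click GENERATE CODE at the top to generate the code.
Run make in the generated project directory to check that the code compiles.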
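Integrating the Model Inference Code
Copy the generated inference code into the MCU project. Then extract the archive obtained in the section on downloading the Cortex-M Micro library and place it inside the generated inference code directory. The resulting layout is as follows:
stm32f767                                       # MCU project directory
├── Core
│   ├── Inc
│   └── Src
│       ├── main.c
│       └── ...
├── Drivers
├── mnist                                       # root directory of the generated code
│   ├── benchmark                               # benchmark routine that integrates and calls the model inference code
│   │   ├── benchmark.c
│   │   ├── data.c
│   │   ├── data.h
│   │   └── ...
│   ├── mindspore-lite-1.8.0-none-cortex-m7     # downloaded Cortex-M7 `Micro` library
│   ├── src                                     # model inference code directory
│   └── ...
├── Makefile
├── startup_stm32f767xx.s
└── STM32F767IGTx_FLASH.ld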
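Modify the Makefile to add the model inference code and its dependent libraries to the project.
The source code to be added in this example consists of the model inference code under src and the sample calling code under the benchmark directory. Modify the C_SOURCES variable in the Makefile to add the source file paths:
C_SOURCES = \
mnist/src/context.c \
mnist/src/model.c \
mnist/src/net.c \
mnist/src/tensor.c \
mnist/src/weight.c \
mnist/benchmark/benchmark.c \
mnist/benchmark/calib_output.c \
mnist/benchmark/load_input.c \
mnist/benchmark/data.c \
...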
Add the header paths of the dependencies by modifying the C_INCLUDES variable in the Makefile to include the following paths:
LITE_PACK = mindspore-lite-1.8.0-none-cortex-m7
C_INCLUDES = \
-Imnist/$(LITE_PACK)/runtime \
-Imnist/$(LITE_PACK)/runtime/include \
-Imnist/$(LITE_PACK)/tools/codegen/include \
-Imnist/$(LITE_PACK)/tools/codegen/third_party/include/CMSIS/Core \
-Imnist/$(LITE_PACK)/tools/codegen/third_party/include/CMSIS/DSP \
-Imnist/$(LITE_PACK)/tools/codegen/third_party/include/CMSIS/NN \
-Imnist \
...
Add the dependent operator libraries (-lnnacl -lwrapper -lcmsis_nn), declare the path of the operator library files, and add the link-time option (-specs=nosys.specs). The modified variable definitions in this example are:
LIBS = -lc -lm -lnosys -lnnacl -lwrapper -lcmsis_nn
LIBDIR = -Lmnist/$(LITE_PACK)/tools/codegen/lib -Lmnist/$(LITE_PACK)/tools/codegen/third_party/lib
LDFLAGS = $(MCU) -specs=nosys.specs -specs=nano.specs -T$(LDSCRIPT) $(LIBDIR) $(LIBS) -Wl,-Map=$(BUILD_DIR)/$(TARGET).map,--cref -Wl,--gc-sections
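Modify main.c to call the benchmark function
Call the benchmark function defined in benchmark.c from the main function. The program in the benchmark folder is a sample that runs inference with the generated code in src and compares the output, and you are free to modify it. In this example we call benchmark directly and assign the run_dnn_flag variable according to the returned result.
run_dnn_flag = '0';
if (benchmark() == 0) {
    printf("\nrun success.\n");
    run_dnn_flag = '1';
} else {
    printf("\nrun failed.\n");
    run_dnn_flag = '2';
}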
Add the header include and the definition of the run_dnn_flag variable at the top of main.c.
#include "benchmark/benchmark.h"
char run_dnn_flag __attribute__((section(".myram"))); // flag variable used for testing
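In this example, to make it easy to read the inference result directly with the programmer, the variable is placed in a custom section (myram). You can set up the custom section as described below, or drop this declaration and obtain the inference result over a serial port or another interface.
To set up the custom section, modify the MEMORY block in STM32F767IGTx_FLASH.ld to add a custom memory region MYRAM (in this example, the start address of RAM is moved up by 4 bytes to make room for MYRAM). Then add a declaration of the custom myram section inside the SECTIONS block.
MEMORY
{
  MYRAM (xrw) : ORIGIN = 0x20000000, LENGTH = 1
  RAM (xrw) : ORIGIN = 0x20000004, LENGTH = 524284
  ...
}
...
SECTIONS
{
  ...
  .myram (NOLOAD):
  {
    . = ALIGN(4);
    _smyram = .; /* create a global symbol at data start */
    *(.sram) /* .data sections */
    *(.sram*) /* .data* sections */
    . = ALIGN(4);
    _emyram = .; /* define a global symbol at data end */
  } >MYRAM AT> FLASH
}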
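The split of the RAM region can be checked with a little arithmetic. The sketch below assumes that the RAM region originally started at 0x20000000 with a length of 524288 bytes (512 KB), which is what the values 0x20000004 and 524284 imply. region_end is a hypothetical helper name.

```shell
#!/bin/sh
# region_end ORIGIN LENGTH: print the first address past a memory region, in hex.
region_end() {
    printf '0x%X\n' $(( $1 + $2 ))
}

# RAM after the change: starts 4 bytes later and is 4 bytes shorter...
region_end 0x20000004 524284
# ...so it should end at the same address as the assumed original 512 KB region.
region_end 0x20000000 524288
```

Both calls print the same end address, so the modified RAM region still ends where the assumed original one did, and only the first 4 bytes are given up for MYRAM.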
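Modify mnist/benchmark/data.c to store the reference input and output data in the program for comparison.
The benchmark routine sets the model's input data and compares the inference result against the expected result to obtain the error. In this example, the model input is set by modifying the calib_input0_data array in data.c, and the expected result is set by modifying calib_output0_data.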
,设定期望结果。float calib_input0_data[NET_INPUT0_SIZE] = {0.54881352186203,0.7151893377304077,0.6027633547782898,0.5448831915855408,0.42365479469299316,0.6458941102027893,0.4375872015953064,0.891772985458374,0.9636627435684204,0.3834415078163147,0.7917250394821167,0.5288949012756348,0.5680445432662964,0.9255966544151306,0.07103605568408966,0.08712930232286453,0.020218396559357643,0.832619845867157,0.7781567573547363,0.8700121641159058,0.978618323802948,0.7991585731506348,0.4614793658256531,0.7805292010307312,0.11827442795038223,0.6399210095405579,0.14335328340530396,0.9446688890457153,0.5218483209609985,0.4146619439125061,0.26455560326576233,0.7742336988449097,0.4561503231525421,0.568433940410614,0.018789799883961678,0.6176354885101318,0.6120957136154175,0.6169340014457703,0.9437480568885803,0.681820273399353,0.35950788855552673,0.43703195452690125,0.6976311802864075,0.0602254718542099,0.6667667031288147,0.670637845993042,0.21038256585597992,0.12892629206180573,0.31542834639549255,0.36371076107025146,0.5701967477798462,0.4386015236377716,0.9883738160133362,0.10204481333494186,0.20887675881385803,0.16130951046943665,0.6531082987785339,0.25329160690307617,0.4663107693195343,0.24442559480667114,0.15896958112716675,0.11037514358758926,0.6563295722007751,0.13818295300006866,0.1965823620557785,0.3687251806259155,0.8209932446479797,0.09710127860307693,0.8379449248313904,0.0960984081029892,0.9764594435691833,0.4686512053012848,0.9767611026763916,0.6048455238342285,0.7392635941505432,0.03918779268860817,0.28280696272850037,0.12019655853509903,0.296140193939209,0.11872772127389908,0.3179831802845001,0.414262980222702,0.06414749473333359,0.6924721002578735,0.5666014552116394,0.26538950204849243,0.5232480764389038,0.09394051134586334,0.5759465098381042,0.9292961955070496,0.3185689449310303,0.6674103736877441,0.13179786503314972,0.7163271903991699,0.28940609097480774,0.18319135904312134,0.5865129232406616,0.02010754682123661,0.8289400339126587,0.004695476032793522,0.6778165102005005,0
.2700079679489136,0.7351940274238586,0.9621885418891907,0.2487531453371048,0.5761573314666748,0.5920419096946716,0.5722519159317017,0.22308163344860077,0.9527490139007568,0.4471253752708435,0.8464086651802063,0.6994792819023132,0.2974369525909424,0.8137978315353394,0.396505743265152,0.8811032176017761,0.5812729001045227,0.8817353844642639,0.6925315856933594,0.7252542972564697,0.5013243556022644,0.9560836553573608,0.6439902186393738,0.4238550364971161,0.6063932180404663,0.019193198531866074,0.30157482624053955,0.6601735353469849,0.2900775969028473,0.6180154085159302,0.42876869440078735,0.1354740709066391,0.29828232526779175,0.5699648857116699,0.5908727645874023,0.5743252635002136,0.6532008051872253,0.6521032452583313,0.43141844868659973,0.8965466022491455,0.36756187677383423,0.4358649253845215,0.8919233679771423,0.806194007396698,0.7038885951042175,0.10022688657045364,0.9194825887680054,0.7142413258552551,0.9988470077514648,0.14944830536842346,0.8681260347366333,0.16249293088912964,0.6155595779418945,0.1238199844956398,0.8480082154273987,0.8073189854621887,0.5691007375717163,0.40718328952789307,0.06916699558496475,0.6974287629127502,0.45354267954826355,0.7220556139945984,0.8663823008537292,0.9755215048789978,0.855803370475769,0.011714084073901176,0.359978049993515,0.729990541934967,0.17162968218326569,0.5210366249084473,0.054337989538908005,0.19999653100967407,0.01852179504930973,0.793697714805603,0.2239246815443039,0.3453516662120819,0.9280812740325928,0.704414427280426,0.031838931143283844,0.1646941602230072,0.6214783787727356,0.5772286057472229,0.23789282143115997,0.9342139959335327,0.6139659285545349,0.5356327891349792,0.5899099707603455,0.7301220297813416,0.31194499135017395,0.39822107553482056,0.20984375476837158,0.18619300425052643,0.9443724155426025,0.739550769329071,0.49045881628990173,0.22741462290287018,0.2543564736843109,0.058029159903526306,0.43441662192344666,0.3117958903312683,0.6963434815406799,0.37775182723999023,0.1796036809682846,0.0246787276118993
76,0.06724963337182999,0.6793927550315857,0.4536968469619751,0.5365791916847229,0.8966712951660156,0.990338921546936,0.21689698100090027,0.6630781888961792,0.2633223831653595,0.02065099962055683,0.7583786249160767,0.32001715898513794,0.38346388936042786,0.5883170962333679,0.8310484290122986,0.6289818286895752,0.872650682926178,0.27354204654693604,0.7980468273162842,0.18563593924045563,0.9527916312217712,0.6874882578849792,0.21550767123699188,0.9473705887794495,0.7308558225631714,0.2539416551589966,0.21331197023391724,0.518200695514679,0.02566271834075451,0.20747007429599762,0.4246854782104492,0.3741699755191803,0.46357542276382446,0.27762871980667114,0.5867843627929688,0.8638556003570557,0.11753185838460922,0.517379105091095,0.13206811249256134,0.7168596982955933,0.39605969190597534,0.5654212832450867,0.1832798421382904,0.14484776556491852,0.4880562722682953,0.35561272501945496,0.9404319524765015,0.7653252482414246,0.748663604259491,0.9037197232246399,0.08342243731021881,0.5521924495697021,0.5844760537147522,0.961936354637146,0.29214751720428467,0.24082878232002258,0.10029394179582596,0.016429629176855087,0.9295293092727661,0.669916570186615,0.7851529121398926,0.28173011541366577,0.5864101648330688,0.06395526975393295,0.48562759160995483,0.9774951338768005,0.8765052556991577,0.3381589651107788,0.961570143699646,0.23170162737369537,0.9493188261985779,0.9413776993751526,0.799202561378479,0.6304479241371155,0.8742879629135132,0.2930202782154083,0.8489435315132141,0.6178767085075378,0.013236857950687408,0.34723350405693054,0.14814086258411407,0.9818294048309326,0.4783703088760376,0.49739137291908264,0.6394725441932678,0.36858460307121277,0.13690027594566345,0.8221177458763123,0.1898479163646698,0.5113189816474915,0.2243170291185379,0.09784448146820068,0.8621914982795715,0.9729194641113281,0.9608346819877625,0.9065554738044739,0.774047315120697,0.3331451416015625,0.08110138773918152,0.40724116563796997,0.2322341352701187,0.13248763978481293,0.053427182137966156,0.7255943
417549133,0.011427458375692368,0.7705807685852051,0.14694663882255554,0.07952208071947098,0.08960303664207458,0.6720477938652039,0.24536721408367157,0.4205394685268402,0.557368814945221,0.8605511784553528,0.7270442843437195,0.2703278958797455,0.131482794880867,0.05537432059645653,0.3015986382961273,0.2621181607246399,0.45614057779312134,0.6832813620567322,0.6956254243850708,0.28351885080337524,0.3799269497394562,0.18115095794200897,0.7885454893112183,0.05684807524085045,0.6969972252845764,0.7786954045295715,0.7774075865745544,0.25942257046699524,0.3738131523132324,0.5875996351242065,0.27282190322875977,0.3708527982234955,0.19705428183078766,0.4598558843135834,0.044612299650907516,0.7997958660125732,0.07695644348859787,0.5188351273536682,0.3068101108074188,0.5775429606437683,0.9594333171844482,0.6455702185630798,0.03536243736743927,0.4304024279117584,0.5100168585777283,0.5361775159835815,0.6813924908638,0.2775960862636566,0.12886056303977966,0.3926756680011749,0.9564056992530823,0.1871308982372284,0.9039839506149292,0.5438059568405151,0.4569114148616791,0.8820413947105408,0.45860394835472107,0.7241676449775696,0.3990253210067749,0.9040443897247314,0.6900250315666199,0.6996220350265503,0.32772040367126465,0.7567786574363708,0.6360610723495483,0.2400202751159668,0.16053882241249084,0.796391487121582,0.9591665863990784,0.4581388235092163,0.5909841656684875,0.8577226400375366,0.45722344517707825,0.9518744945526123,0.5757511854171753,0.8207671046257019,0.9088436961174011,0.8155238032341003,0.15941447019577026,0.6288984417915344,0.39843425154685974,0.06271295249462128,0.4240322411060333,0.25868406891822815,0.849038302898407,0.03330462798476219,0.9589827060699463,0.35536885261535645,0.3567068874835968,0.01632850244641304,0.18523232638835907,0.40125951170921326,0.9292914271354675,0.0996149331331253,0.9453015327453613,0.869488537311554,0.4541623890399933,0.326700896024704,0.23274412751197815,0.6144647002220154,0.03307459130883217,0.015606064349412918,0.428795725107193,0.06807
407736778259,0.2519409954547882,0.2211609184741974,0.253191202878952,0.13105523586273193,0.012036222964525223,0.11548429727554321,0.6184802651405334,0.9742562174797058,0.9903450012207031,0.40905410051345825,0.1629544198513031,0.6387617588043213,0.4903053343296051,0.9894098043441772,0.06530420482158661,0.7832344174385071,0.28839850425720215,0.24141861498355865,0.6625045537948608,0.24606318771839142,0.6658591032028198,0.5173085331916809,0.4240889847278595,0.5546877980232239,0.2870515286922455,0.7065746784210205,0.414856880903244,0.3605455458164215,0.8286569118499756,0.9249669313430786,0.04600730910897255,0.2326269894838333,0.34851935505867004,0.8149664998054504,0.9854914546012878,0.9689717292785645,0.904948353767395,0.2965562641620636,0.9920112490653992,0.24942004680633545,0.10590615123510361,0.9509525895118713,0.2334202527999878,0.6897682547569275,0.05835635960102081,0.7307090759277344,0.8817201852798462,0.27243688702583313,0.3790569007396698,0.3742961883544922,0.7487882375717163,0.2378072440624237,0.17185309529304504,0.4492916464805603,0.30446839332580566,0.8391891121864319,0.23774182796478271,0.5023894309997559,0.9425836205482483,0.6339976787567139,0.8672894239425659,0.940209686756134,0.7507648468017578,0.6995750665664673,0.9679655432701111,0.9944007992744446,0.4518216848373413,0.07086978107690811,0.29279401898384094,0.15235470235347748,0.41748636960983276,0.13128933310508728,0.6041178107261658,0.38280805945396423,0.8953858613967896,0.96779465675354,0.5468848943710327,0.2748235762119293,0.5922304391860962,0.8967611789703369,0.40673333406448364,0.5520782470703125,0.2716527581214905,0.4554441571235657,0.4017135500907898,0.24841345846652985,0.5058664083480835,0.31038081645965576,0.37303486466407776,0.5249704718589783,0.7505950331687927,0.3335074782371521,0.9241587519645691,0.8623185753822327,0.048690296709537506,0.2536425292491913,0.4461355209350586,0.10462789237499237,0.34847599267959595,0.7400975227355957,0.6805144548416138,0.6223844289779663,0.7105283737182617,0.20
492368936538696,0.3416981101036072,0.676242470741272,0.879234790802002,0.5436780452728271,0.2826996445655823,0.030235258862376213,0.7103368043899536,0.007884103804826736,0.37267908453941345,0.5305371880531311,0.922111451625824,0.08949454873800278,0.40594232082366943,0.024313200265169144,0.3426109850406647,0.6222310662269592,0.2790679335594177,0.2097499519586563,0.11570323258638382,0.5771402716636658,0.6952700018882751,0.6719571352005005,0.9488610029220581,0.002703213831409812,0.6471966505050659,0.60039222240448,0.5887396335601807,0.9627703428268433,0.016871673986315727,0.6964824199676514,0.8136786222457886,0.5098071694374084,0.33396488428115845,0.7908401489257812,0.09724292904138565,0.44203564524650574,0.5199523568153381,0.6939564347267151,0.09088572859764099,0.2277594953775406,0.4103015661239624,0.6232946515083313,0.8869608044624329,0.618826150894165,0.13346147537231445,0.9805801510810852,0.8717857599258423,0.5027207732200623,0.9223479628562927,0.5413808226585388,0.9233060479164124,0.8298973441123962,0.968286395072937,0.919782817363739,0.03603381663560867,0.1747720092535019,0.3891346752643585,0.9521427154541016,0.300028920173645,0.16046763956546783,0.8863046765327454,0.4463944137096405,0.9078755974769592,0.16023047268390656,0.6611174941062927,0.4402637481689453,0.07648676633834839,0.6964631676673889,0.2473987489938736,0.03961552307009697,0.05994429811835289,0.06107853725552559,0.9077329635620117,0.7398838996887207,0.8980623483657837,0.6725823283195496,0.5289399027824402,0.30444636940956116,0.997962236404419,0.36218905448913574,0.47064894437789917,0.37824517488479614,0.979526937007904,0.1746583878993988,0.32798799872398376,0.6803486943244934,0.06320761889219284,0.60724937915802,0.47764649987220764,0.2839999794960022,0.2384132742881775,0.5145127177238464,0.36792758107185364,0.4565199017524719,0.3374773859977722,0.9704936742782593,0.13343943655490875,0.09680395573377609,0.3433917164802551,0.5910269021987915,0.6591764688491821,0.3972567617893219,0.9992780089378357,0.35
189300775527954,0.7214066386222839,0.6375827193260193,0.8130538463592529,0.9762256741523743,0.8897936344146729,0.7645619511604309,0.6982485055923462,0.335498183965683,0.14768557250499725,0.06263600289821625,0.2419017106294632,0.432281494140625,0.521996259689331,0.7730835676193237,0.9587409496307373,0.1173204779624939,0.10700414329767227,0.5896947383880615,0.7453980445861816,0.848150372505188,0.9358320832252502,0.9834262132644653,0.39980170130729675,0.3803351819515228,0.14780867099761963,0.6849344372749329,0.6567619442939758,0.8620625734329224,0.09725799411535263,0.49777689576148987,0.5810819268226624,0.2415570467710495,0.16902540624141693,0.8595808148384094,0.05853492394089699,0.47062090039253235,0.11583399772644043,0.45705875754356384,0.9799623489379883,0.4237063527107239,0.857124924659729,0.11731556057929993,0.2712520658969879,0.40379273891448975,0.39981213212013245,0.6713835000991821,0.3447181284427643,0.713766872882843,0.6391869187355042,0.399161159992218,0.43176013231277466,0.614527702331543,0.0700421929359436,0.8224067091941833,0.65342116355896,0.7263424396514893,0.5369229912757874,0.11047711223363876,0.4050356149673462,0.40537357330322266,0.3210429847240448,0.029950324445962906,0.73725426197052,0.10978446155786514,0.6063081622123718,0.7032175064086914,0.6347863078117371,0.95914226770401,0.10329815745353699,0.8671671748161316,0.02919023483991623,0.534916877746582,0.4042436182498932,0.5241838693618774,0.36509987711906433,0.19056691229343414,0.01912289671599865,0.5181497931480408,0.8427768349647522,0.3732159435749054,0.2228638231754303,0.080532006919384,0.0853109210729599,0.22139644622802734,0.10001406073570251,0.26503971219062805,0.06614946573972702,0.06560486555099487,0.8562761545181274,0.1621202677488327,0.5596824288368225,0.7734555602073669,0.4564095735549927,0.15336887538433075,0.19959613680839539,0.43298420310020447,0.52823406457901,0.3494403064250946,0.7814795970916748,0.7510216236114502,0.9272118210792542,0.028952548280358315,0.8956912755966187,0.3925687
9687309265,0.8783724904060364,0.690784752368927,0.987348735332489,0.7592824697494507,0.3645446300506592,0.5010631680488586,0.37638914585113525,0.364911824464798,0.2609044909477234,0.49597030878067017,0.6817399263381958,0.27734026312828064,0.5243797898292542,0.117380291223526,0.1598452925682068,0.04680635407567024,0.9707314372062683,0.0038603513967245817,0.17857997119426727,0.6128667593002319,0.08136960119009018,0.8818964958190918,0.7196201682090759,0.9663899540901184,0.5076355338096619,0.3004036843776703,0.549500584602356,0.9308187365531921,0.5207614302635193,0.2672070264816284,0.8773987889289856,0.3719187378883362,0.0013833499979227781,0.2476850152015686,0.31823351979255676,0.8587774634361267,0.4585031569004059,0.4445872902870178,0.33610227704048157,0.880678117275238,0.9450267553329468,0.9918903112411499,0.3767412602901459,0.9661474227905273,0.7918795943260193,0.675689160823822,0.24488948285579681,0.21645726263523102,0.1660478264093399,0.9227566123008728,0.2940766513347626,0.4530942440032959,0.49395784735679626,0.7781715989112854,0.8442349433898926,0.1390727013349533,0.4269043505191803,0.842854917049408,0.8180332779884338}; float calib_output0_data[NET_OUTPUT0_SIZE] = {3.5647096e-05,6.824297e-08,0.009327697,3.2340475e-05,1.1117579e-05,1.5117058e-06,4.6314454e-07,5.161628e-11,0.9905911,3.8835238e-10};
Compiling and Flashing the Project
Compiling
In the MCU project directory, run make to compile. On success the output is as follows, where test_stm767 is the MCU project name in this example:
arm-none-eabi-size build/test_stm767.elf
   text    data     bss     dec     hex filename
 120316    3620   87885  211821   33b6d build/test_stm767.elf
arm-none-eabi-objcopy -O ihex build/test_stm767.elf build/test_stm767.hex
arm-none-eabi-objcopy -O binary -S build/test_stm767.elf build/test_stm767.bin
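Flashing and Running
The code can be flashed and run with the STM32CubePrg programming tool. On the PC, connect a flashable development board through STLink, then run the following command in the current MCU project directory to flash and run the program:
bash ${STMSTM32CubePrg_PATH}/bin/STM32_Programmer.sh -c port=SWD -w build/test_stm767.bin 0x08000000 -s 0x08000000
${STMSTM32CubePrg_PATH} is the installation path of STM32CubePrg. For the meaning of each argument in the command, see the STM32CubePrg user manual.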
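Verifying the Inference Result
In this example, the benchmark result flag is stored in the 1-byte memory region starting at 0x20000000, so the program's result can be obtained by reading that address directly with the programmer.
On the PC, connect a development board that has already been flashed through STLink, and read the memory with the following command: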
bash ${STMSTM32CubePrg_PATH}/bin/STM32_Programmer.sh -c port=SWD mode=HOTPLUG --upload 0x20000000 0x1 ret.bin
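${STMSTM32CubePrg_PATH} is the installation path of STM32CubePrg. For the meaning of each argument in the command, see the STM32CubePrg user manual.
The data read is saved in ret.bin. Run cat ret.bin. If on-board inference succeeded, ret.bin contains the character 1 and the output is: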
1
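Since run_dnn_flag is assigned '0', '1', or '2' in the main.c code of this example, the check can also be scripted. The sketch below reads the first byte of the dump and maps it to a message. decode_flag is a hypothetical helper, and reading '0' as "benchmark did not finish" is an interpretation of the initial assignment in main.c rather than something the tutorial states.

```shell
#!/bin/sh
# decode_flag FILE: interpret the first byte of a memory dump of run_dnn_flag.
decode_flag() {
    flag=$(head -c 1 "$1")
    case "$flag" in
        1) echo "run success" ;;
        2) echo "run failed" ;;
        0) echo "benchmark did not finish" ;;
        *) echo "unexpected value"; return 1 ;;
    esac
}

# Example with a stand-in dump file; on real hardware pass ret.bin instead.
tmp=$(mktemp)
printf '1' > "$tmp"
decode_flag "$tmp"
rm -f "$tmp"
```

Any other byte is reported as unexpected, which can happen if the board is read before the program has written the flag.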
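Running Inference on Light Harmony Devices
Preparing the Light Harmony Build Environment
You can learn how to compile and flash in the Light Harmony environment from the OpenHarmony official website. This tutorial uses the Hi3516 development board as an example to show how to deploy an inference model with Micro in the Light Harmony environment.
Compiling the Model
Use converter_lite to compile the lenet model and generate inference code for the Light Harmony platform with the following command:
./converter_lite --fmk=TFLITE --modelFile=mnist.tflite --outputFile=${SOURCE_CODE_DIR} --configFile=${CONFIG_FILE}
In the config file, set target = ARM32.
Writing the Build Script
For Light Harmony application development, first refer to Running Hello OHOS. Copy the mnist directory generated in the previous step to any path in the HarmonyOS source tree, for example applications/sample/, then create a BUILD.gn file:
<harmony-source-path>/applications/sample/mnist
├── benchmark
├── CMakeLists.txt
├── BUILD.gn
└── src
Download the precompiled inference runtime package for OpenHarmony and extract it to any path in the HarmonyOS source tree. Write the BUILD.gn file: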
import("//build/lite/config/component/lite_component.gni")
import("//build/lite/ndk/ndk.gni")

lite_component("mnist_benchmark") {
  target_type = "executable"
  sources = [
    "benchmark/benchmark.cc",
    "benchmark/calib_output.cc",
    "benchmark/load_input.c",
    "src/net.c",
    "src/weight.c",
    "src/session.cc",
    "src/tensor.cc",
  ]
  features = []
  include_dirs = [
    "<YOUR MINDSPORE LITE RUNTIME PATH>/runtime",
    "<YOUR MINDSPORE LITE RUNTIME PATH>/tools/codegen/include",
    "//applications/sample/mnist/benchmark",
    "//applications/sample/mnist/src",
  ]
  ldflags = [
    "-fno-strict-aliasing",
    "-Wall",
    "-pedantic",
    "-std=gnu99",
  ]
  libs = [
    "<YOUR MINDSPORE LITE RUNTIME PATH>/runtime/lib/libmindspore-lite.a",
    "<YOUR MINDSPORE LITE RUNTIME PATH>/tools/codegen/lib/libwrapper.a",
  ]
  defines = [
    "NOT_USE_STL",
    "ENABLE_NEON",
    "ENABLE_ARM",
    "ENABLE_ARM32"
  ]
  cflags = [
    "-fno-strict-aliasing",
    "-Wall",
    "-pedantic",
    "-std=gnu99",
  ]
  cflags_cc = [
    "-fno-strict-aliasing",
    "-Wall",
    "-pedantic",
    "-std=c++17",
  ]
}
<YOUR MINDSPORE LITE RUNTIME PATH> is the path of the extracted inference runtime package, for example //applications/sample/mnist/mindspore-lite-1.3.0-ohos-aarch32.
Modify the file build/lite/components/applications.json to add the configuration of the mnist_benchmark component:
{
  "component": "mnist_benchmark",
  "description": "Communication related samples.",
  "optional": "true",
  "dirs": [
    "applications/sample/mnist"
  ],
  "targets": [
    "//applications/sample/mnist:mnist_benchmark"
  ],
  "rom": "",
  "ram": "",
  "output": [],
  "adapted_kernel": [ "liteos_a" ],
  "features": [],
  "deps": {
    "components": [],
    "third_party": []
  }
},
Modify the file vendor/hisilicon/hispark_taurus/config.json to add an entry for the mnist_benchmark component:
{ "component": "mnist_benchmark", "features":[] }
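Compiling the benchmark
cd <openharmony-source-path>
hb set (set the build path)
. (select the current path)
Select ipcamera_hispark_taurus@hisilicon and press Enter
hb build mnist_benchmark (run the build)
The resulting file is out/hispark_taurus/ipcamera_hispark_taurus/bin/mnist_benchmark.
Running the benchmark
Extract mnist_benchmark, the weight file (mnist/src/net.bin), and the input file, copy them to the development board, and run: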
OHOS # ./mnist_benchmark mnist_input.bin net.bin 1
OHOS # =======run benchmark======
input 0: mnist_input.bin
loop count: 1
total time: 10.11800ms, per time: 10.11800ms
outputs:
name: int8toft32_Softmax-7_post0/output-0, DataType: 43, Elements: 10, Shape: [1 10 ], Data:
0.000000, 0.000000, 0.003906, 0.000000, 0.000000, 0.992188, 0.000000, 0.000000, 0.000000, 0.000000,
========run success=======
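For an MNIST classifier, the ten output values are naturally read as scores for the digits 0 to 9, so the predicted digit is the index of the largest value. That reading is an assumption, since the tutorial does not state which digit mnist_input.bin contains. The argmax helper below is hypothetical and only post-processes the printed Data line.

```shell
#!/bin/sh
# argmax: read one comma-separated line of scores on stdin and
# print the 0-based index of the largest value.
argmax() {
    awk -F',' '{
        best = 0; max = $1 + 0
        for (i = 2; i <= NF; i++) {
            if ($i ~ /^[ \t]*$/) continue   # skip the empty field after a trailing comma
            if ($i + 0 > max) { max = $i + 0; best = i - 1 }
        }
        print best
    }'
}

# The Data line from the benchmark output above.
echo "0.000000, 0.000000, 0.003906, 0.000000, 0.000000, 0.992188, 0.000000, 0.000000, 0.000000, 0.000000," | argmax
```

For the Data line shown in the log, the largest score is 0.992188 at index 5.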
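Custom Operators
Before using this feature, refer to Custom Operators to learn the basic concepts. Micro currently supports registration and implementation of custom operators of the custom type only. Registration and custom implementation of built-in operators (such as conv2d and fc) are not yet supported. The following uses the HiSilicon Hi3516D development board as an example to show how to use custom operators in Micro.
Code is generated from the model in the same way as for models without custom operators:
./converter_lite --fmk=TFLITE --modelFile=mnist.tflite --outputFile=${SOURCE_CODE_DIR} --configFile=${CONFIG_FILE}
In the config file, set target = ARM32.
Implementing the Custom Operator
The previous step generates a source directory under the path you specified. It contains a header file named src/registered_kernel.h that declares the function of the custom operator:
int CustomKernel(TensorC *inputs, int input_num, TensorC *outputs, int output_num, CustomParameter *param);
You need to provide the implementation of this function and integrate the relevant source code or library into the cmake project of the generated code. For example, we provide libmicro_nnie.so, a sample dynamic library of a custom kernel that supports HiSilicon NNIE. It is included in the "NNIE inference runtime library, benchmark tool" component on the official download page. You need to modify the CMakeLists.txt of the generated code to add the names and paths of the libraries to link. For example: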
link_directories(<YOUR_PATH>/mindspore-lite-1.8.1-linux-aarch32/providers/Hi3516D)
link_directories(<HI3516D_SDK_PATH>)
target_link_libraries(benchmark net micro_nnie nnie mpi VoiceEngine upvqe dnvqe securec -lm -pthread)
In the generated benchmark/benchmark.c file, add the NNIE device initialization code before and after the calls in the main function, then compile the source:
mkdir build && cd build
cmake -DCMAKE_TOOLCHAIN_FILE=<MS_SRC_PATH>/mindspore/lite/cmake/himix200.toolchain.cmake -DPLATFORM_ARM32=ON -DPKG_PATH=<RUNTIME_PKG_PATH> ..
make
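Combining Micro Inference with On-Device Training
Overview
Beyond MCUs, Micro inference is an inference mode in which the model structure and the weights are separated. Training generally changes the weights but not the model structure. Where training and inference work together, on-device training plus Micro inference can therefore be used to benefit from the small runtime memory and low power consumption of Micro inference. The process consists of the following steps:
Export the inference model based on on-device training
Use the converter_lite tool to generate model inference code for the same architecture as on-device training
Download the Micro library for the same architecture as on-device training
Integrate the inference code with the Micro library, then compile and deploy
Export the weights of the inference model based on on-device training, overwrite the original weight file, and verify
The following sections describe each step and its caveats in detail.
Exporting the Inference Model from Training
See the On-Device Training section.
Generating Inference Code
See the content above, but note two points. First, the model exported from training is an ms model, so fmk must be set to MSLITE during conversion. Second, to combine training with Micro inference, the weights exported from training and the weights exported by Micro must match exactly, so two attributes were added to the Micro configuration parameters to guarantee weight consistency.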
[micro_param]
# false indicates that only the required weights will be saved. Default is false.
# If collaborate with lite-train, the parameter must be true.
keep_original_weight=false
# the names of those weight-tensors whose shape is changeable, only embedding-table supports change now.
# the parameter is used to collaborate with lite-train. If set, `keep_original_weight` must be true.
changeable_weights_name=name0,name1
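keep_original_weight is the key attribute for guaranteeing weight consistency. When working with training, it must be true. changeable_weights_name is for special scenarios, for example when the shape of certain weights changes. Currently only a change in the number of entries in an embedding table is supported, and in general you do not need to set this attribute.
Compilation and Deployment
See the content above.
Exporting the Weights of the Inference Model from Training
The MindSpore Serialization class provides the ExportWeightsCollaborateWithMicro function, whose prototype is: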
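Putting the two points together, a configuration for the training scenario might be written as in the sketch below. The helper write_train_cfg, the file name micro_train.cfg, and the model name train_export.ms are placeholders, and target=x86 is only an example value. The enable_micro, target, and support_parallel entries are taken from the configuration shown earlier in this tutorial.

```shell
#!/bin/sh
# write_train_cfg FILE: write a Micro config for use together with on-device training.
# (Hypothetical helper; the keys come from this tutorial, the values are examples.)
write_train_cfg() {
    cat > "$1" <<'EOF'
[micro_param]
enable_micro=true
target=x86
support_parallel=false
# must be true when collaborating with on-device training
keep_original_weight=true
EOF
}

write_train_cfg micro_train.cfg

# The conversion would then use fmk=MSLITE (placeholder file names):
# ./converter_lite --fmk=MSLITE --modelFile=train_export.ms --outputFile=train_export --configFile=micro_train.cfg
```

changeable_weights_name is left out here because, as noted above, it is only needed when the shape of a weight tensor changes.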
static Status ExportWeightsCollaborateWithMicro(const Model &model, ModelType model_type,
const std::string &weight_file, bool is_inference = true,
bool enable_fp16 = false,
const std::vector<std::string> &changeable_weights_name = {});
Here, is_inference currently only supports true.