Offline Conversion of Inference Models

Overview

MindSpore Lite cloud-side inference provides tools for offline conversion of models, and supports many types of model conversions, and the converted models can be used for inference. Command line parameters include a variety of personalized options to provide users with convenient conversion paths.

The currently supported input formats are MindSpore, TensorFlow Lite, Caffe, TensorFlow, and ONNX.

The mindir model converted by the converter supports the converter companion and higher versions of the Runtime inference framework to perform inference.

Note: Due to interface compatibility issues, the conversion tool cannot be run in CANN packages below version 7.5 due to interface compatibility issues. The version number of the CANN package is the content in the latest/version.cfg directory of the CANN package installation directory.

Linux Environment Usage Instructions

Environment Preparation

To use MindSpore Lite cloud-side inference model converter, the following environment preparation is required.

Compile or download model converter.
Add the dynamic link libraries required by the converter to the environment variable LD_LIBRARY_PATH.
```
export LD_LIBRARY_PATH=${PACKAGE_ROOT_PATH}/tools/converter/lib:${LD_LIBRARY_PATH}
```
${PACKAGE_ROOT_PATH} is the path of the compiled or downloaded package after unpacking.

Directory Structure

mindspore-lite-{version}-linux-x64
└── tools
    └── converter
        ├── include                            # Custom operators, model parsing, node parsing, conversion optimization registration headers
        ├── converter                          # Model converter
        │   └── converter_lite                 # Executable programs
        └── lib                                # Dynamic libraries that the converter depends on
            ├── libmindspore_glog.so.0         # Glog dynamic libraries
            ├── libascend_pass_plugin.so       # Register for Ascend Backend Graph Optimization Plugin Dynamic Library
            ├── libmslite_shared_lib.so        # Adaptation of the dynamic library in the backend of Ascend
            ├── libmindspore_converter.so      # Dynamic library for model conversion
            ├── libmslite_converter_plugin.so  # Model conversion plugin
            ├── libmindspore_core.so           # MindSpore Core dynamic libraries
            ├── libopencv_core.so.4.5          # Dynamic libraries for OpenCV
            ├── libopencv_imgcodecs.so.4.5     # Dynamic libraries for OpenCV
            └── libopencv_imgproc.so.4.5       # Dynamic libraries for OpenCV
        ├── third_party                        # Third party model proto definition

Description of Parameters

MindSpore Lite cloud-side inference model converter provides various parameter settings that users can choose to use according to their needs. In addition, users can enter ./converter_lite --help for live help.

Detailed parameter descriptions are provided below.

Parameters	Required or not	Description of parameters	Value range	Default values	Remarks
`--help`	Not	Print all help information.	-	-	-
`--fmk=<FMK>`	Required	The original format of the input model.	MINDIR, CAFFE, TFLITE, TF, ONNX	-	-
`--modelFile=<MODELFILE>`	Required	The path of the input model.	-	-	-
`--outputFile=<OUTPUTFILE>`	Required	The path to the output model, without the suffix, can be automatically generated with the `.mindir` suffix.	-	-	-
`--weightFile=<WEIGHTFILE>`	Required when converting Caffe models	The path to the input model weight file.	-	-	-
`--configFile=<CONFIGFILE>`	Not	1. can be used as a post-training quantization profile path; 2. can be used as an extended function profile path.	-	-	-
`--inputShape=<INPUTSHAPE>`	Not	Set the dimensions of the model inputs, and keep the order of the input dimensions the same as the original model. The model structure can be further optimized for some specific models, but the converted model will probably lose the dynamic shape properties. Multiple inputs are split by `;`, along with double quotes `""`.	e.g. "inTensorName_1: 1,32,32,4;inTensorName_2:1,64,64,4;"	-	-
`--saveType=<SAVETYPE>`	Not	Set the exported model as `mindir` model or `ms` model.	MINDIR, MINDIR_LITE	MINDIR	This version can only be reasoned with models turned out by setting to MINDIR
`--optimize=<OPTIMIZE>`	Not	Set the optimization accomplished in the process of converting model.	none, general, gpu_oriented, ascend_oriented	general	-
`--decryptKey=<DECRYPTKEY>`	Not	Set the key used to load the cipher text MindIR. The key is expressed in hexadecimal and is only valid when `fmk` is MINDIR.	-	-	-
`--decryptMode=<DECRYPTMODE>`	Not	Set the mode to load the cipher MindIR, valid only when decryptKey is specified.	AES-GCM, AES-CBC	AES-GCM	-
`--encryptKey=<ENCRYPTKEY>`	Not	Set the key to export the encryption `mindir` model. The key is expressed in hexadecimal. Only AES-GCM is supported, and the key length is only 16Byte.	-	-	-
`--encryption=<ENCRYPTION>`	Not	Set whether to encrypt when exporting `mindir` models. Export encryption protects model integrity, but increases runtime initialization time.	true, false	true	-
`--infer=<INFER>`	Not	Set whether to perform pre-inference when the conversion is completed.	true, false	false	-
`--inputDataFormat=<INPUTDATAFORMAT>`	Not	Set the input format of the exported model, valid only for 4-dimensional inputs.	NHWC, NCHW	-	-
`--fp16=<FP16>`	Not	Set whether the weights in float32 data format need to be stored in float16 data format during model serialization.	on, off	off	Not supported at the moment
`--inputDataType=<INPUTDATATYPE>`	Not	Set the data type of the quantized model input tensor. Only if the quantization parameters (scale and zero point) of the model input tensor are available. The default is to keep the same data type as the original model input tensor.	FLOAT32, INT8, UINT8, DEFAULT	DEFAULT	Not supported at the moment
`--outputDataType=<OUTPUTDATATYPE>`	Not	Set the data type of the quantized model output tensor. Only if the quantization parameters (scale and zero point) of the model output tensor are available. The default is to keep the same data type as the original model output tensor.	FLOAT32, INT8, UINT8, DEFAULT	DEFAULT	Not supported at the moment
`--device=<DEVICE>`	Not	Set target device when converter model. The use case is when on the Ascend device, if you need to the converted model to have the ability to use Ascend backend to perform inference, you can set the parameter. If it is not set, the converted model will use CPU backend to perform inference by default.	This option will be deprecated. It is replaced by setting `optimize` option to `ascend_oriented`
`--optimizeTransformer=<OPTIMIZETRANSFORMER>`	Not	Set whether to do transformer-fursion or not.	true, false	false	only support tensorrt

Notes:

The parameter name and the parameter value are connected by an equal sign without any space between them.
Caffe models are generally divided into two files: *.prototxt model structure, corresponding to the --modelFile parameter, and *.caffemodel model weights, corresponding to the --weightFile parameter.
The configFile configuration file uses the key=value approach to define the relevant parameters.
--optimize parameter is used to set the mode of optimization during the offline conversion. If this parameter is set to none, no relevant graph optimization operations will be performed during the offline conversion phase of the model, and the relevant graph optimization operations will be done during the execution of the inference phase. The advantage of this parameter is that the converted model can be deployed directly to any CPU/GPU/Ascend hardware backend since it is not optimized in a specific way, while the disadvantage is that the initialization time of the model increases during inference execution. If this parameter is set to general, general optimization will be performed, such as constant folding and operator fusion (the converted model only supports CPU/GPU hardware backend, not Ascend backend). If this parameter is set to gpu_oriented, the general optimization and extra optimization for GPU hardware will be performed (the converted model only supports GPU hardware backend). If this parameter is set to ascend_oriented, the optimization for Ascend hardware will be performed (the converted model only supports Ascend hardware backend).
The encryption and decryption function only takes effect when MSLITE_ENABLE_MODEL_ENCRYPTION=on is set at compile time and only supports Linux x86 platforms. decrypt_key and encrypt_key are string expressed in hexadecimal. Linux platform users can use the' xxd 'tool to convert the key expressed in bytes into hexadecimal expressions.
For the MindSpore model, since it is already a mindir model, two approaches are suggested:

Inference is performed directly without offline conversion.

When using offline conversion, setting --optimize to general in CPU/GPU hardware backend (for general optimization), setting --optimize to gpu_oriented in GPU hardware (for GPU extra optimization based on general optimization), setting --optimize to ascend_oriented in Ascend hardware. The relevant optimization is done in the offline phase to reduce the initialization time of inference execution.

Usage Examples

The following selects common examples to illustrate the use of the conversion command.

Take the Caffe model LeNet as an example and execute the conversion command.
```
./converter_lite --fmk=CAFFE --saveType=MINDIR --optimize=none --modelFile=lenet.prototxt --weightFile=lenet.caffemodel --outputFile=lenet
```
In this example, because the Caffe model is used, two input files, model structure and model weights, are required. Together with the other two required parameters, fmk type and output path, it can be executed successfully.

The result is shown as:
```
CONVERT RESULT SUCCESS:0
```
This indicates that the Caffe model has been successfully converted into a MindSpore Lite cloud-side inference model, obtaining the new file lenet.mindir.

Take MindSpore, TensorFlow Lite, TensorFlow and ONNX models as examples and execute the conversion command.

MindSpore model model.mindir

./converter_lite --fmk=MINDIR --saveType=MINDIR --optimize=general --modelFile=model.mindir --outputFile=model

TensorFlow Lite model model.tflite

./converter_lite --fmk=TFLITE --saveType=MINDIR --optimize=none --modelFile=model.tflite --outputFile=model

TensorFlow model model.pb

./converter_lite --fmk=TF --saveType=MINDIR --optimize=none --modelFile=model.pb --outputFile=model

ONNX model model.onnx

./converter_lite --fmk=ONNX --saveType=MINDIR --optimize=none --modelFile=model.onnx --outputFile=model

In all of the above cases, the following conversion success message is displayed and the model.mindir target file is obtained at the same time.

CONVERT RESULT SUCCESS:0