Inference Execution
Translator: Dongdong92
MindSpore can execute inference tasks for trained models on different hardware platforms, and it also provides an online inference service based on MindSpore Serving.
Inference Service Based on Models
Overview
MindSpore supports saving trained parameters in the CheckPoint format, and saving network model files in the MindIR, AIR, and ONNX formats.
As described in the Executing Inference section, users can not only execute local inference through the mindspore.Model.predict interface, but also export MindIR, AIR, and ONNX model files through mindspore.export for inference on different hardware platforms.
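As a minimal sketch of local inference, the snippet below loads parameters from a CheckPoint file and calls Model.predict; the network definition, input shape, and checkpoint file name are placeholders to be replaced by the actual trained network.

```python
import numpy as np
import mindspore as ms
from mindspore import nn, Tensor, Model

# Placeholder network; replace it with the actual trained network definition.
net = nn.Dense(32, 10)

# Save and reload parameters in the CheckPoint format ("net.ckpt" is a placeholder name).
ms.save_checkpoint(net, "net.ckpt")
param_dict = ms.load_checkpoint("net.ckpt")
ms.load_param_into_net(net, param_dict)

# Wrap the network and run local inference on one batch of input data.
model = Model(net)
inputs = Tensor(np.random.rand(1, 32).astype(np.float32))
output = model.predict(inputs)
print(output.shape)
```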
To mask the differences between models on different backends, model files in the MindIR format are recommended. MindIR model files can be executed on different hardware platforms, and can also be deployed to the Serving platform on the cloud and to the Lite platform on devices.
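A sketch of exporting a network with mindspore.export is shown below; the placeholder network and file name stand in for the trained network, and the input Tensor only defines the shape and dtype of the exported model's input.

```python
import numpy as np
import mindspore as ms
from mindspore import nn, Tensor

# Placeholder network; in practice, export the trained network instance.
net = nn.Dense(32, 10)

# The input Tensor only defines the shape and dtype of the exported model's input.
dummy_input = Tensor(np.random.rand(1, 32).astype(np.float32))

# Export as MindIR (recommended); file_format="AIR" or "ONNX" selects the other formats.
ms.export(net, dummy_input, file_name="net", file_format="MINDIR")
```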
Executing Inference on Different Platforms
For the Ascend hardware platform, please refer to Inference on the Ascend 910 AI processor and Inference on Ascend 310.
For the GPU/CPU hardware platform, please refer to Inference on a GPU/CPU.
For inference on the Lite platform on device, please refer to on-device inference.
For issues when using the MindSpore C++ inference interface on the Ascend hardware platform, please refer to MindSpore C++ Library Use.
On-line Inference Service Deployment Based on MindSpore Serving
MindSpore Serving is a lightweight, high-performance service module that helps MindSpore developers efficiently deploy online inference services in production environments. After completing a training task with MindSpore, users can export the trained model and deploy it as an inference service via MindSpore Serving. Refer to the MindSpore Serving deployment examples for details.
For deployment issues regarding the on-line inference service, please refer to MindSpore Serving Class.
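A minimal server-side startup sketch is given below. It assumes the mindspore_serving 1.x Python server API (ServableStartConfig, start_servables, start_grpc_server, start_restful_server); the servable directory, servable name, and addresses are placeholders, and the exact interface may differ between Serving releases.

```python
# Server-side startup sketch for MindSpore Serving (assumes the 1.x `server` module).
from mindspore_serving import server

def start():
    # `servable_directory` must contain a servable folder (named after `servable_name`)
    # holding the exported MindIR file and its servable_config.py; values are placeholders.
    servable_config = server.ServableStartConfig(
        servable_directory="/path/to/servables",
        servable_name="add",
        device_ids=0)
    server.start_servables(servable_configs=servable_config)

    # Expose the inference service over gRPC and/or RESTful endpoints.
    server.start_grpc_server(address="127.0.0.1:5500")
    server.start_restful_server(address="127.0.0.1:1500")

if __name__ == "__main__":
    start()
```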