Overall Architecture (Lite)

MindSpore Lite is an ultra-fast, intelligent, and simplified AI engine that enables intelligent applications in all scenarios, provides E2E solutions for users, and helps users enable AI capabilities.

MindSpore Lite is divided into two parts: offline module and online module. The overall architecture of MindSpore Lite is as follows:

architecture

Offline module:
- 3rd Model Parsers: converts third-party models to a unified MindIR. Third-party models include TensorFlow, TensorFlow Lite, Caffe 1.0, and ONNX models.
- MindIR: MindSpore device-cloud unified IR.
- Optimizer: optimizes graphs based on IR, such as operator fusion and constant folding.
- Quantizer: quantization module after training. Quantizer supports quantization methods after training, such as weight quantization and activation value quantization.
- benchmark: a tool set for testing performance and debugging accuracy.
- Micro CodeGen: a tool to directly compile models into executable files for IoT scenarios.
Online module:
- Training/Inference APIs: the unified C++/Java training inference interface for the device and cloud.
- MindRT Lite: lightweight online runtime, it supports asynchronous execution.
- MindData Lite: used for the device-side data processing.
- Delegate: agent for docking professional AI hardware engine.
- Kernels: the built-in high-performance operator library which provides CPU, GPU and NPU operators.
- Learning Strategies: device-side learning strategies, such as transfer learning.