Using Java Interface to Perform Training
Overview
The principal procedures of lite training is as follows:
Design the network and export the
MindIR
model file by using the cloud side APIs.Transfer the
MindIR
file to .ms model file.Train, evaluate and save
ms
model files.
The model structure is saved in the transferred
ms
model file which will be load to the device platform for training.
The following figure shows the detailed training process:
For more javaAPI description, please refer to API Documentation.
Model Creating, Loading and Building
Model is the main entrance of the MindSpore Lite framework. We can compile and execute graph models through Model
class.
Reading Models
A Model file is flatbuffer-serialized file which was converted using the MindSpore Model Converter Tool. These files have a .ms
extension. Before model training and/or inference, the model needs to be loaded from the file system and parsed. Related operations are mainly implemented in the [Graph]((https://www.mindspore.cn/lite/api/en/r1.9/api_java/graph.html#graph) class which holds the model data such as the network structure, weights data and operators attributes.
Creating Contexts
MSContext is a MindSpore Lite Object which contains basic configuration parameters required by the sessions to guide graph compilation and execution. It allows to define the device to run the model, e.g., CPU or GPU, the number of threads used for training and inference and the memory allocation scheme. Currently, only CPU device is supported in training.
Creating TrainLoop
Users can create the object of the class Model
by using the function Build
to call MindData APIs. The member function Build
of the class Model
, its prototype is as follows:
public boolean build(Graph graph, MSContext context, TrainCfg cfg);
The following codes show how to create a training session based on the multi-threads CPU by using the class Model
.
Graph g = new Graph();
MSContext context = new MSContext();
context.init(1, 0);
context.addDeviceInfo(DeviceType.DT_CPU, false, 0);
TrainCfg cfg = new TrainCfg();
cfg.init();
Model liteModel = new Model();
liteModel.build(g, context, cfg);
Refer Train a LeNet for more details.
Data Processing
Data Reading Pipeline
Currently, java does not provide data processing API such as C++ Dataset
class and its extended classes. Users need to define the data preprocessing process by themselves. After processing the image or text data into byte data, copy it to the input of the model.
Example
The following codes shows the Mnist data reading and data preprocessing process:
Currently, java does not provide data processing API such as C++ Dataset
class and its extended classes. Users need to define the data preprocessing process by themselves. After processing the image or text data into byte data, copy it to the input of the model.
public void readMNISTFile(String inputFileName, String labelFileName, Vector<DataLabelTuple> dataset) {
try {
BufferedInputStream ibin = new BufferedInputStream(new FileInputStream(inputFileName));
BufferedInputStream lbin = new BufferedInputStream(new FileInputStream(labelFileName));
byte[] bytes = new byte[4];
...
// read images
int image_size = n_rows * n_cols;
byte[] image_data = new byte[image_size];
for (int i = 0; i < lnumber; i++) {
byte [] hwc_bin_image = new byte[32 * 32];
readFile(ibin, image_data, image_size);
for (int r = 0; r < 32; r++) {
for (int c = 0; c < 32; c++) {
int index = r * 32 + c;
if (r < 2 || r > 29 || c < 2 || c > 29) {
hwc_bin_image[index] = 0;
} else {
hwc_bin_image[index] = image_data[(r-2)*28 + (c-2)];
}
}
}
DataLabelTuple data_label_tupel = new DataLabelTuple();
data_label_tupel.data = hwc_bin_image;
data_label_tupel.label = labels[i];
dataset.add(data_label_tupel);
}
} catch (IOException e) {
System.err.println("Read Dateset exception");
}
}
Refer Train a LeNet for more details.
Execute Training
MindSpore Lite java interface can obtain the output of the model through the interface provided by the Model
class. In the training mode, the output of the model is loss, and in the inference mode, the output of the model is the predicted value. The training and inference modes can be switched through the setTrainMode
interface. Execute the model through the runStep
interface.
Training
Create the objects of the off-the-shelf functions and call the Train function of the class Model to train:
model.setTrainMode(true);
for (int i = 0; i < cycles; i++) {
for (int b = 0; b < virtualBatch; b++) {
fillInputData(ds.getTrainData(), false);
isSuccess = model.runStep();
if (!isSuccess) {
model.free();
System.err.println("run step failed");
return -1;
}
float loss = getLoss();
if (min_loss > loss) {
min_loss = loss;
}
if ((b == 0) && ((i + 1) % 500 == 0)) {
float acc = calculateAccuracy(10); // only test 10 batch size
if (max_acc < acc) {
max_acc = acc;
}
System.out.println("step_" + (i + 1) + ": \tLoss is " + loss + " [min=" + min_loss + "]" + " " +
"max_acc=" + max_acc);
}
}
}
Refer Train a LeNet for more details.
Evaluating
Similarly, switch to inference mode through the setTrainMode
interface:
model.setTrainMode(false);
With TrainSessions, a network can be used for both inference and training. These two modes differ in several aspects:
The input of the network: Running inference requires only the data, while running training requires both data and labels.
The output of the network: Running inference returns the predicted values in the output, while running in training mode returns the loss.
In training mode, the weights of the layers are updated in each Run, while in inference mode they are static.
Some layers behave differently in inference vs. training mode, e.g., updating the accumulated batch mean and variance in Batch Normalization layers.
Others
Model Mode Switching
The function prototype of setTrainMode
in the Model
class is as follows:
public boolean setTrainMode(boolean isTrain)
Resizing the Input Dimension
When MindSpore Lite is used for inference, if the input shape needs to be resized, you can call the Resize API of Model to resize the shape of the input tensor after a model is created and built.
Some networks do not support variable dimensions. As a result, an error message is displayed and the model exits unexpectedly. For example, the model contains the MatMul operator, one input tensor of the MatMul operator is the weight, and the other input tensor is the input. If a variable dimension API is called, the input tensor does not match the shape of the weight tensor. As a result, the training fails.
The following sample code demonstrates how to perform Resize on the input tensor of MindSpore Lite:
List<MSTensor> inputs = session.getInputs();
int[][] dims = {{1, 300, 300, 3}};
bool ret = model.resize(inputs, dims);
Obtaining Input Tensors
Before graph execution, whether it is during training or inference, the input data must be filled-in into the model input tensors. MindSpore Lite provides the following methods to obtain model input tensors:
Use the
getInputByTensorName
method to obtain model input tensors that are connected to the model input node based on the tensor name./** * Get input tensor by tensor name. * * @param tensorName name. * @return input tensor. */ public MSTensor getInputByTensorName(String tensorName);
Use the
getInputs
method to directly obtain the vectors of all model input tensors./** * Get model inputs tensor. * * @return input tensors. */ public List<MSTensor> getInputs();
If the model requires more than one input tensor (this is certainly the case during training, where both data and labels serve as inputs of the network) it is the user’s responsibility to know the inputs order or their tensorName. This can be obtained from the Python model. Alternatively, one can deduce this information from the sizes of the input tensors.
Copying Data
After model input tensors are obtained, the data must be copied into the tensors. The following methods allows to access the size of the data, the number of elements, the data type and the writable pointer. See also detailed description in the MSTensor API documentation.
// Assuming model is a valid instance of Model List<MSTensor> inputs = model.getInputs(); imageTensor = inputs.get(dataIndex); imageInputBuf = ByteBuffer.allocateDirect((int) imageTensor.size()); imageInputBuf.order(ByteOrder.nativeOrder());
Obtaining Output Tensors
After each execution of the graph, the user might want to read the model’s outputs, whether it is the loss in the case of training mode, or the predicted output in the case of evaluation mode.
MindSpore Lite provides the following methods to obtain the model’s output MSTensor
.
Use the
getOutputsByNodeName
method to obtain the output tensors that belong to a certain node:/** * Get output tensors by node name. * * @param nodeName output node name * @return output tensor */ public List<MSTensor> getOutputsByNodeName(String nodeName);
Use the
getOutputByTensorName
method to obtain an output tensor, based on the tensor name./** * Get output tensor names. * * @return output tensor name list. */ public MSTensor getOutputByTensorName(String tensorName);
Use the
getOutputs
method to obtain all the output tensors, ordered by their tensor names./** * Get model outputs. * * @return model outputs tensor. */ public List<MSTensor> getOutputs();
Saving Model
MindSpore Lite provides the export
interface to save the model, the prototype is as follows:
/**
* Export the model.
*
* @param fileName Name Model file name.
* @param quantizationType The quant type.0,no_quant,1,weight_quant,2,full_quant.
* @param isOnlyExportInfer if export only inferece.
* @param outputTensorNames tensor name used for export inference graph.
* @return Whether the export is successful.
*/
public boolean export(String fileName, int quantizationType, boolean isOnlyExportInfer,List<String> outputTensorNames);
You can load the saved model to perform re-training or inference.