{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Model基本使用\n", "\n", "[![下载Notebook](https://mindspore-website.obs.cn-north-4.myhuaweicloud.com/website-images/r1.8/resource/_static/logo_notebook.png)](https://obs.dualstack.cn-north-4.myhuaweicloud.com/mindspore-website/notebook/r1.8/tutorials/zh_cn/advanced/train/mindspore_model.ipynb) [![下载样例代码](https://mindspore-website.obs.cn-north-4.myhuaweicloud.com/website-images/r1.8/resource/_static/logo_download_code.png)](https://obs.dualstack.cn-north-4.myhuaweicloud.com/mindspore-website/notebook/r1.8/tutorials/zh_cn/advanced/train/mindspore_model.py) [![查看源文件](https://mindspore-website.obs.cn-north-4.myhuaweicloud.com/website-images/r1.8/resource/_static/logo_source.png)](https://gitee.com/mindspore/docs/blob/r1.8/tutorials/source_zh_cn/advanced/train/model.ipynb)\n", "\n", "通常情况下,定义训练和评估网络并直接运行,已经可以满足基本需求。\n", "\n", "一方面,`Model`可以在一定程度上简化代码。例如:无需手动遍历数据集;在不需要自定义`nn.TrainOneStepCell`的场景下,可以借助`Model`自动构建训练网络;可以使用`Model`的`eval`接口进行模型评估,直接输出评估结果,无需手动调用评价指标的`clear`、`update`、`eval`函数等。\n", "\n", "另一方面,`Model`提供了很多高阶功能,如数据下沉、混合精度等,在不借助`Model`的情况下,使用这些功能需要花费较多的时间仿照`Model`进行自定义。\n", "\n", "本文档首先对MindSpore的Model进行基本介绍,然后重点讲解如何使用`Model`进行模型训练、评估和推理。\n", "\n", "![model](https://mindspore-website.obs.cn-north-4.myhuaweicloud.com/website-images/r1.8/tutorials/source_zh_cn/advanced/train/images/model.png)\n", "\n", "## Model基本介绍\n", "\n", "[Model](https://mindspore.cn/docs/zh-CN/r1.8/api_python/mindspore/mindspore.Model.html#mindspore.Model)是MindSpore提供的高阶API,可以进行模型训练、评估和推理。其接口的常用参数如下:\n", "\n", "- `network`:用于训练或推理的神经网络。\n", "- `loss_fn`:所使用的损失函数。\n", "- `optimizer`:所使用的优化器。\n", "- `metrics`:用于模型评估的评价函数。\n", "- `eval_network`:模型评估所使用的网络,未定义情况下,`Model`会使用`network`和`loss_fn`进行封装。\n", "\n", "`Model`提供了以下接口用于模型训练、评估和推理:\n", "\n", "- `train`:用于在训练集上进行模型训练。\n", "- `eval`:用于在验证集上进行模型评估。\n", "- `predict`:用于对输入的一组数据进行推理,输出预测结果。\n", "\n", "### 使用Model接口\n", "\n", 
"对于简单场景的神经网络,可以在定义`Model`时指定前向网络`network`、损失函数`loss_fn`、优化器`optimizer`和评价函数`metrics`。\n", "\n", "此时,`Model`会使用`network`作为前向网络,并使用`nn.WithLossCell`和`nn.TrainOneStepCell`构建训练网络,使用`nn.WithEvalCell`构建评估网络。" ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "ExecuteTime": { "end_time": "2022-01-04T06:43:30.392367Z", "start_time": "2022-01-04T06:43:28.436687Z" } }, "outputs": [], "source": [ "import numpy as np\n", "import mindspore.dataset as ds\n", "import mindspore.nn as nn\n", "import mindspore as ms\n", "from mindspore.common.initializer import Normal\n", "\n", "def get_data(num, w=2.0, b=3.0):\n", " \"\"\"生成样本数据及对应的标签\"\"\"\n", " for _ in range(num):\n", " x = np.random.uniform(-10.0, 10.0)\n", " noise = np.random.normal(0, 1)\n", " y = x * w + b + noise\n", " yield np.array([x]).astype(np.float32), np.array([y]).astype(np.float32)\n", "\n", "def create_dataset(num_data, batch_size=16):\n", " \"\"\"生成数据集\"\"\"\n", " dataset = ds.GeneratorDataset(list(get_data(num_data)), column_names=['data', 'label'])\n", " dataset = dataset.batch(batch_size)\n", " return dataset\n", "\n", "class LinearNet(nn.Cell):\n", " \"\"\"定义线性回归网络\"\"\"\n", " def __init__(self):\n", " super().__init__()\n", " self.fc = nn.Dense(1, 1, Normal(0.02), Normal(0.02))\n", "\n", " def construct(self, x):\n", " return self.fc(x)\n", "\n", "train_dataset = create_dataset(num_data=160)\n", "net = LinearNet()\n", "crit = nn.MSELoss()\n", "opt = nn.Momentum(net.trainable_params(), learning_rate=0.005, momentum=0.9)\n", "\n", "# 使用Model构建训练网络\n", "model = ms.Model(network=net, loss_fn=crit, optimizer=opt, metrics={\"mae\"})" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 模型训练\n", "\n", "使用`train`接口执行模型训练,`train`接口的常用参数如下:\n", "\n", "- `epoch`:训练执行轮次,通常每个epoch都会使用全量数据集进行训练。\n", "- `train_dataset`:一个训练数据集迭代器。\n", "- `callbacks`:训练过程中需要执行的回调对象或者回调对象列表。\n", "\n", "有意思的是,如果网络模型定义了`loss_fn`,则数据和标签会被分别传给`network`和`loss_fn`,此时数据集需要返回一个元组(data, 
label)。如果数据集中有多个数据或者标签,可以设置`loss_fn`为None,并在`network`中实现自定义损失函数,此时数据集返回的所有数据组成的元组(data1, data2, data3, …)会传给`network`。\n", "\n", "如下示例使用`train`接口执行模型训练,通过`LossMonitor`回调函数查看在训练过程中的损失函数值。" ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "ExecuteTime": { "end_time": "2022-01-04T06:43:30.952190Z", "start_time": "2022-01-04T06:43:30.525149Z" }, "scrolled": true }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Epoch:[ 0/ 1], step:[ 1/ 10], loss:[115.354/115.354], time:242.467 ms, lr:0.00500\n", "Epoch:[ 0/ 1], step:[ 2/ 10], loss:[86.149/100.751], time:0.650 ms, lr:0.00500\n", "Epoch:[ 0/ 1], step:[ 3/ 10], loss:[17.299/72.934], time:0.712 ms, lr:0.00500\n", "Epoch:[ 0/ 1], step:[ 4/ 10], loss:[21.070/59.968], time:0.744 ms, lr:0.00500\n", "Epoch:[ 0/ 1], step:[ 5/ 10], loss:[42.781/56.530], time:0.645 ms, lr:0.00500\n", "Epoch:[ 0/ 1], step:[ 6/ 10], loss:[52.374/55.838], time:0.577 ms, lr:0.00500\n", "Epoch:[ 0/ 1], step:[ 7/ 10], loss:[53.629/55.522], time:0.588 ms, lr:0.00500\n", "Epoch:[ 0/ 1], step:[ 8/ 10], loss:[16.356/50.626], time:0.624 ms, lr:0.00500\n", "Epoch:[ 0/ 1], step:[ 9/ 10], loss:[5.504/45.613], time:0.730 ms, lr:0.00500\n", "Epoch:[ 0/ 1], step:[ 10/ 10], loss:[5.396/41.591], time:0.766 ms, lr:0.00500\n", "Epoch time: 259.696 ms, per step time: 25.970 ms, avg loss: 41.591\n" ] } ], "source": [ "from mindvision.engine.callback import LossMonitor\n", "\n", "# 模型训练,LossMonitor的入参0.005为学习率\n", "model.train(1, train_dataset, callbacks=[LossMonitor(0.005)])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 模型评估\n", "\n", "使用`eval`接口进行评估,`eval`接口参数如下:\n", "\n", "- `valid_dataset`:评估模型的数据集。\n", "- `callbacks`:评估过程中需要执行的回调对象或回调对象列表。\n", "- `dataset_sink_mode`:数据是否直接下沉至处理器进行处理。" ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "ExecuteTime": { "end_time": "2022-01-04T06:43:31.009605Z", "start_time": "2022-01-04T06:43:30.954322Z" } }, "outputs": [ { "name": "stdout", "output_type": 
"stream", "text": [ "{'mae': 4.2325128555297855}\n" ] } ], "source": [ "eval_dataset = create_dataset(num_data=80) # 创建评估数据集\n", "eval_result = model.eval(eval_dataset) # 执行模型评估\n", "print(eval_result)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 模型推理\n", "\n", "使用`predict`接口进行推理,`predict`接口参数如下:\n", "\n", "- `predict_data`:预测样本,数据可以是单个张量、张量列表或张量元组。" ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "ExecuteTime": { "end_time": "2022-01-04T06:43:31.042129Z", "start_time": "2022-01-04T06:43:31.011659Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[[-6.9463778 ]\n", " [ 1.3816066 ]\n", " [13.233659 ]\n", " [11.863918 ]\n", " [ 0.73616135]\n", " [-0.1280173 ]\n", " [ 7.579297 ]\n", " [-4.9149694 ]\n", " [ 7.416003 ]\n", " [10.491856 ]\n", " [-5.7275047 ]\n", " [ 9.984399 ]\n", " [-7.156473 ]\n", " [ 2.7091386 ]\n", " [-6.3339615 ]\n", " [-6.0259247 ]]\n" ] } ], "source": [ "eval_data = eval_dataset.create_dict_iterator()\n", "data = next(eval_data)\n", "# 执行模型预测\n", "output = model.predict(data[\"data\"])\n", "print(output)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "一般情况下需要对推理结果进行后处理才能得到比较直观的推理结果。\n", "\n", "## 自定义场景\n", "\n", "MindSpore提供的网络封装函数`nn.WithLossCell`、`nn.TrainOneStepCell`和`nn.WithEvalCell`并不适用于所有场景,实际场景中常常需要自定义网络的封装函数,这种情况下`Model`使用这些封装函数自动地进行封装显然是不合理的。\n", "\n", "接下来介绍在自定义网络封装函数时如何正确地使用`Model`。" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 自定义损失网络\n", "\n", "在有多个数据或者多个标签的场景下,可以使用自定义损失网络将前向网络和自定义的损失函数链接起来作为`Model`的`network`,`loss_fn`使用默认值`None`,此时`Model`内部不会经过`nn.WithLossCell`,而会直接使用`nn.TrainOneStepCell`将`network`与`optimizer`组成训练网络。" ] }, { "cell_type": "code", "execution_count": 5, "metadata": { "ExecuteTime": { "end_time": "2022-01-04T06:43:31.051039Z", "start_time": "2022-01-04T06:43:31.043718Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Epoch:[ 0/ 1], step:[ 1/ 10], loss:[11.036/11.036], time:212.864 ms, lr:0.00500\n", 
"Epoch:[ 0/ 1], step:[ 2/ 10], loss:[9.984/10.510], time:0.592 ms, lr:0.00500\n", "Epoch:[ 0/ 1], step:[ 3/ 10], loss:[9.300/10.107], time:0.660 ms, lr:0.00500\n", "Epoch:[ 0/ 1], step:[ 4/ 10], loss:[7.526/9.462], time:0.787 ms, lr:0.00500\n", "Epoch:[ 0/ 1], step:[ 5/ 10], loss:[6.959/8.961], time:0.715 ms, lr:0.00500\n", "Epoch:[ 0/ 1], step:[ 6/ 10], loss:[10.290/9.183], time:0.716 ms, lr:0.00500\n", "Epoch:[ 0/ 1], step:[ 7/ 10], loss:[10.067/9.309], time:0.770 ms, lr:0.00500\n", "Epoch:[ 0/ 1], step:[ 8/ 10], loss:[8.924/9.261], time:0.909 ms, lr:0.00500\n", "Epoch:[ 0/ 1], step:[ 9/ 10], loss:[7.257/9.038], time:0.884 ms, lr:0.00500\n", "Epoch:[ 0/ 1], step:[ 10/ 10], loss:[6.138/8.748], time:0.955 ms, lr:0.00500\n", "Epoch time: 232.046 ms, per step time: 23.205 ms, avg loss: 8.748\n" ] } ], "source": [ "import numpy as np\n", "import mindspore.dataset as ds\n", "import mindspore.ops as ops\n", "import mindspore.nn as nn\n", "import mindspore as ms\n", "from mindspore.nn import LossBase\n", "from mindvision.engine.callback import LossMonitor\n", "\n", "def get_multilabel_data(num, w=2.0, b=3.0):\n", " \"\"\"生成多标签数据,产生一组数据x对应两个标签y1和y2\"\"\"\n", " for _ in range(num):\n", " x = np.random.uniform(-10.0, 10.0)\n", " noise1 = np.random.normal(0, 1)\n", " noise2 = np.random.normal(-1, 1)\n", " y1 = x * w + b + noise1\n", " y2 = x * w + b + noise2\n", " yield np.array([x]).astype(np.float32), np.array([y1]).astype(np.float32), np.array([y2]).astype(np.float32)\n", "\n", "def create_multilabel_dataset(num_data, batch_size=16):\n", " \"\"\"生成多标签数据集,一个数据data对应两个标签label1和label2\"\"\"\n", " dataset = ds.GeneratorDataset(list(get_multilabel_data(num_data)), column_names=['data', 'label1', 'label2'])\n", " dataset = dataset.batch(batch_size)\n", " return dataset\n", "\n", "class L1LossForMultiLabel(LossBase):\n", " \"\"\"自定义多标签损失函数\"\"\"\n", "\n", " def __init__(self, reduction=\"mean\"):\n", " super(L1LossForMultiLabel, self).__init__(reduction)\n", " self.abs = 
ops.Abs()\n", "\n", "    def construct(self, base, target1, target2):\n", "        \"\"\"Three inputs: the prediction base and the ground-truth values target1 and target2\"\"\"\n", "        x1 = self.abs(base - target1)\n", "        x2 = self.abs(base - target2)\n", "        return self.get_loss(x1) / 2 + self.get_loss(x2) / 2\n", "\n", "class CustomWithLossCell(nn.Cell):\n", "    \"\"\"Connect the forward network and the loss function\"\"\"\n", "\n", "    def __init__(self, backbone, loss_fn):\n", "        \"\"\"Two inputs: the forward network backbone and the loss function loss_fn\"\"\"\n", "        super(CustomWithLossCell, self).__init__(auto_prefix=False)\n", "        self._backbone = backbone\n", "        self._loss_fn = loss_fn\n", "\n", "    def construct(self, data, label1, label2):\n", "        output = self._backbone(data)  # Forward pass to get the network output\n", "        return self._loss_fn(output, label1, label2)  # Compute the multi-label loss\n", "\n", "multi_train_dataset = create_multilabel_dataset(num_data=160)\n", "\n", "# Build the linear regression network\n", "net = LinearNet()\n", "# Multi-label loss function\n", "loss = L1LossForMultiLabel()\n", "\n", "# Connect the linear regression network and the multi-label loss function\n", "loss_net = CustomWithLossCell(net, loss)\n", "opt = nn.Momentum(net.trainable_params(), learning_rate=0.005, momentum=0.9)\n", "\n", "# Use Model to connect the network and the optimizer; Model does not use nn.WithLossCell internally here\n", "model = ms.Model(network=loss_net, optimizer=opt)\n", "# Train the model with the train interface\n", "model.train(epoch=1, train_dataset=multi_train_dataset, callbacks=[LossMonitor(0.005)])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Custom Training Network\n", "\n", "With a custom training network, the training network must be built manually and passed as the `network` of `Model`, with both `loss_fn` and `optimizer` left at their default value `None`. In this case, `Model` uses `network` as the training network without any wrapping.\n", "\n", "The following example defines a custom training network `CustomTrainOneStepCell` and then builds the training network through the `Model` interface." ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Epoch:[ 0/ 1], step:[ 1/ 10], loss:[5.165/5.165], time:183.006 ms, lr:0.01000\n", "Epoch:[ 0/ 1], step:[ 2/ 10], loss:[4.042/4.603], time:0.800 ms, lr:0.01000\n", "Epoch:[ 0/ 1], step:[ 3/ 10], loss:[3.385/4.197], time:0.886 ms, lr:0.01000\n", "Epoch:[ 0/ 1], step:[ 4/ 10], loss:[2.438/3.758], time:0.896 ms, 
lr:0.01000\n", "Epoch:[ 0/ 1], step:[ 5/ 10], loss:[2.457/3.498], time:0.819 ms, lr:0.01000\n", "Epoch:[ 0/ 1], step:[ 6/ 10], loss:[2.546/3.339], time:0.921 ms, lr:0.01000\n", "Epoch:[ 0/ 1], step:[ 7/ 10], loss:[4.569/3.515], time:0.973 ms, lr:0.01000\n", "Epoch:[ 0/ 1], step:[ 8/ 10], loss:[4.031/3.579], time:1.271 ms, lr:0.01000\n", "Epoch:[ 0/ 1], step:[ 9/ 10], loss:[6.138/3.864], time:1.035 ms, lr:0.01000\n", "Epoch:[ 0/ 1], step:[ 10/ 10], loss:[3.055/3.783], time:1.263 ms, lr:0.01000\n", "Epoch time: 203.473 ms, per step time: 20.347 ms, avg loss: 3.783\n" ] } ], "source": [ "import mindspore.nn as nn\n", "import mindspore.ops as ops\n", "import mindspore as ms\n", "from mindvision.engine.callback import LossMonitor\n", "\n", "class CustomTrainOneStepCell(nn.Cell):\n", "    \"\"\"Custom training network\"\"\"\n", "\n", "    def __init__(self, network, optimizer, sens=1.0):\n", "        \"\"\"Three parameters: the network to train, the optimizer, and the backpropagation scaling factor\"\"\"\n", "        super(CustomTrainOneStepCell, self).__init__(auto_prefix=False)\n", "        self.network = network  # Forward network\n", "        self.network.set_grad()  # Build the backward network\n", "        self.optimizer = optimizer  # Optimizer\n", "        self.weights = self.optimizer.parameters  # Parameters to update\n", "        self.grad = ops.GradOperation(get_by_list=True, sens_param=True)  # Gradient function for backpropagation\n", "\n", "    def construct(self, *inputs):\n", "        loss = self.network(*inputs)  # Run the forward network to compute the loss for the current inputs\n", "        grads = self.grad(self.network, self.weights)(*inputs, loss)  # Backpropagate to compute the gradients\n", "        loss = ops.depend(loss, self.optimizer(grads))  # Update the parameters with the optimizer\n", "        return loss\n", "\n", "multi_train_ds = create_multilabel_dataset(num_data=160)\n", "\n", "# Build the training network manually\n", "train_net = CustomTrainOneStepCell(loss_net, opt)\n", "# Pass the training network to Model\n", "model = ms.Model(train_net)\n", "# Train the model\n", "model.train(epoch=1, train_dataset=multi_train_ds, callbacks=[LossMonitor(0.01)])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Custom Evaluation Network\n", "\n", "`Model` builds the evaluation network with `nn.WithEvalCell` by default. When that does not meet the requirements, for example with multiple data items and multiple labels, the evaluation network must be built manually.\n", "\n", "The following example defines a custom evaluation network `CustomWithEvalCell` and then builds the evaluation network through the `Model` interface." ] }, {
"cell_type": "code", "execution_count": 7, "metadata": { "ExecuteTime": { "end_time": "2022-01-04T06:43:31.373188Z", "start_time": "2022-01-04T06:43:31.369046Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "{'mae1': 2.5686439752578734, 'mae2': 2.4921266555786135}\n" ] } ], "source": [ "import mindspore.nn as nn\n", "import mindspore as ms\n", "\n", "\n", "class CustomWithEvalCell(nn.Cell):\n", " \"\"\"自定义多标签评估网络\"\"\"\n", "\n", " def __init__(self, network):\n", " super(CustomWithEvalCell, self).__init__(auto_prefix=False)\n", " self.network = network\n", "\n", " def construct(self, data, label1, label2):\n", " output = self.network(data)\n", " return output, label1, label2\n", "\n", "# 构建多标签评估数据集\n", "multi_eval_dataset = create_multilabel_dataset(num_data=80)\n", "\n", "# 构建评估网络\n", "eval_net = CustomWithEvalCell(net)\n", "\n", "# 评估函数\n", "mae1 = nn.MAE()\n", "mae2 = nn.MAE()\n", "mae1.set_indexes([0, 1])\n", "mae2.set_indexes([0, 2])\n", "\n", "# 使用Model构建评估网络\n", "model = ms.Model(network=loss_net, optimizer=opt, eval_network=eval_net,\n", " metrics={\"mae1\": mae1, \"mae2\": mae2})\n", "result = model.eval(multi_eval_dataset)\n", "print(result)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "上述代码在进行模型评估时,评估网络的输出会透传给评估指标的`update`函数,`update`函数将接收到三个输入,分别为`logits`、`label1`和`label2`。\n", "\n", "`nn.MAE`仅允许在两个输入上计算评价指标,因此使用`set_indexes`指定`mae1`使用下标为0和1的输入,也就是`logits`和`label1`,计算评估结果;指定`mae2`使用下标为0和2的输入,也就是`logits`和`label2`,计算评估结果。" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 网络推理\n", "\n", " `Model`没有提供用于指定自定义推理网络的参数,此时可以直接运行前向网络获得推理结果。" ] }, { "cell_type": "code", "execution_count": 8, "metadata": { "ExecuteTime": { "end_time": "2022-01-04T06:43:31.481326Z", "start_time": "2022-01-04T06:43:31.449243Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[[-21.598358 ]\n", " [ -1.0123782]\n", " [ 10.457726 ]\n", " [ 12.409237 ]\n", " [ 19.666183 ]\n", " [ -5.846529 ]\n", " [ 
9.387393 ]\n", " [ 2.6558673]\n", " [-15.15129 ]\n", " [-14.876989 ]\n", " [ 19.112661 ]\n", " [ 22.647848 ]\n", " [ 4.9035554]\n", " [ 20.119627 ]\n", " [ -8.339532 ]\n", " [ -2.7513359]]\n" ] } ], "source": [ "for d in multi_eval_dataset.create_dict_iterator():\n", "    data = d[\"data\"]\n", "    break\n", "\n", "output = net(data)\n", "print(output)" ] } ], "metadata": { "kernelspec": { "display_name": "MindSpore", "language": "python", "name": "mindspore" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.3" } }, "nbformat": 4, "nbformat_minor": 4 }