{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# 创建网络\n", "\n", "[![下载Notebook](https://mindspore-website.obs.cn-north-4.myhuaweicloud.com/website-images/r1.8/resource/_static/logo_notebook.png)](https://obs.dualstack.cn-north-4.myhuaweicloud.com/mindspore-website/notebook/r1.8/tutorials/zh_cn/beginner/mindspore_model.ipynb) [![下载样例代码](https://mindspore-website.obs.cn-north-4.myhuaweicloud.com/website-images/r1.8/resource/_static/logo_download_code.png)](https://obs.dualstack.cn-north-4.myhuaweicloud.com/mindspore-website/notebook/r1.8/tutorials/zh_cn/beginner/mindspore_model.py) [![查看源文件](https://mindspore-website.obs.cn-north-4.myhuaweicloud.com/website-images/r1.8/resource/_static/logo_source.png)](https://gitee.com/mindspore/docs/blob/r1.8/tutorials/source_zh_cn/beginner/model.ipynb)\n", "\n", "神经网络模型由多个数据操作层组成,`mindspore.nn`提供了各种网络基础模块。本章以构建LeNet-5网络为例,先展示使用`mindspore.nn`建立神经网络模型,再展示使用`mindvision.classification.models`快速构建LeNet-5网络模型。\n", "\n", "> `mindvision.classification.models`是基于`mindspore.nn`开发的网络模型接口,提供了一些经典且常用的网络模型,方便用户使用。\n", "\n", "## LeNet-5模型\n", "\n", "[LeNet-5](https://ieeexplore.ieee.org/document/726791)是Yann LeCun教授于1998年提出的一种典型的卷积神经网络,在MNIST数据集上达到99.4%准确率,是CNN领域的第一篇经典之作。其模型结构如下图所示:\n", "\n", "![LeNet-5](https://mindspore-website.obs.cn-north-4.myhuaweicloud.com/website-images/r1.8/tutorials/source_zh_cn/beginner/images/lenet.png)\n", "\n", "按照LeNet的网络结构,LeNet除去输入层共有7层,其中有2个卷积层,2个子采样层,3个全连接层。\n", "\n", "## 定义模型类\n", "\n", "上图中用C代表卷积层,用S代表采样层,用F代表全连接层。\n", "\n", "图片的输入size固定在$32*32$,为了获得良好的卷积效果,要求数字在图片的中央,所以输入$32*32$其实为$28*28$图片填充后的结果。另外不像CNN网络三通道的输入图片,LeNet图片的输入仅是规范化后的二值图像。网络的输出为0\\~9十个数字的预测概率,可以理解为输入图像属于0\\~9数字的可能性大小。\n", "\n", "MindSpore的`Cell`类是构建所有网络的基类,也是网络的基本单元。构建神经网络时,需要继承`Cell`类,并重写`__init__`方法和`construct`方法。" ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [], "source": [ "import mindspore.nn as nn\n", "\n", "class LeNet5(nn.Cell):\n", " \"\"\"\n", " LeNet-5网络结构\n", " \"\"\"\n", " def __init__(self, num_class=10, num_channel=1):\n", " super(LeNet5, self).__init__()\n", " # 卷积层,输入的通道数为num_channel,输出的通道数为6,卷积核大小为5*5\n", " self.conv1 = nn.Conv2d(num_channel, 6, 5, pad_mode='valid')\n", " # 卷积层,输入的通道数为6,输出的通道数为16,卷积核大小为5*5\n", " self.conv2 = nn.Conv2d(6, 16, 5, pad_mode='valid')\n", " # 全连接层,输入个数为16*5*5,输出个数为120\n", " self.fc1 = nn.Dense(16 * 5 * 5, 120)\n", " # 全连接层,输入个数为120,输出个数为84\n", " self.fc2 = nn.Dense(120, 84)\n", " # 全连接层,输入个数为84,分类的个数为num_class\n", " self.fc3 = nn.Dense(84, num_class)\n", " # ReLU激活函数\n", " self.relu = nn.ReLU()\n", " # 池化层\n", " self.max_pool2d = nn.MaxPool2d(kernel_size=2, stride=2)\n", " # 多维数组展平为一维数组\n", " self.flatten = nn.Flatten()\n", "\n", " def construct(self, x):\n", " # 使用定义好的运算构建前向网络\n", " x = self.conv1(x)\n", " x = self.relu(x)\n", " x = self.max_pool2d(x)\n", " x = self.conv2(x)\n", " x = self.relu(x)\n", " x = self.max_pool2d(x)\n", " x = self.flatten(x)\n", " x = self.fc1(x)\n", " x = self.relu(x)\n", " x = self.fc2(x)\n", " x = self.relu(x)\n", " x = self.fc3(x)\n", " return x" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "接下来建立上面定义的神经网络模型,并查看该网络模型的结构。" ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "LeNet5<\n", " (conv1): Conv2d\n", " (conv2): Conv2d\n", " (fc1): Dense\n", " (fc2): Dense\n", " (fc3): Dense\n", " (relu): ReLU<>\n", " (max_pool2d): MaxPool2d\n", " (flatten): Flatten<>\n", " >\n" ] } ], "source": [ "model = LeNet5()\n", "\n", "print(model)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 模型层\n", "\n", "本小节内容首先将会介绍LeNet-5网络中使用到`Cell`类的关键成员函数,然后通过实例化网络介绍如何利用`Cell`类访问模型参数,更多`Cell`类内容参考[mindspore.nn接口](https://www.mindspore.cn/docs/zh-CN/r1.8/api_python/mindspore.nn.html)。\n", "\n", "### nn.Conv2d\n", "\n", "加入`nn.Conv2d`层,给网络中加入卷积函数,帮助神经网络提取特征。" ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "(1, 6, 32, 32)\n" ] } ], "source": [ "import numpy as np\n", "\n", "import mindspore as ms\n", "\n", "# 输入的通道数为1,输出的通道数为6,卷积核大小为5*5,使用normal算子初始化参数,不填充像素\n", "conv2d = nn.Conv2d(1, 6, 5, has_bias=False, weight_init='normal', pad_mode='same')\n", "input_x = ms.Tensor(np.ones([1, 1, 32, 32]), ms.float32)\n", "\n", "print(conv2d(input_x).shape)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### nn.ReLU\n", "\n", "加入`nn.ReLU`层,给网络中加入非线性的激活函数,帮助神经网络学习各种复杂的特征。" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[0. 2. 0. 2. 0.]\n" ] } ], "source": [ "relu = nn.ReLU()\n", "\n", "input_x = ms.Tensor(np.array([-1, 2, -3, 2, -1]), ms.float16)\n", "\n", "output = relu(input_x)\n", "print(output)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### nn.MaxPool2d\n", "\n", "初始化`nn.MaxPool2d`层,将6×28×28的张量降采样为6×7×7的张量。" ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "(1, 6, 7, 7)\n" ] } ], "source": [ "max_pool2d = nn.MaxPool2d(kernel_size=4, stride=4)\n", "input_x = ms.Tensor(np.ones([1, 6, 28, 28]), ms.float32)\n", "\n", "print(max_pool2d(input_x).shape)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### nn.Flatten\n", "\n", "初始化`nn.Flatten`层,将1×16×5×5的四维张量转换为400个连续元素的二维张量。" ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "(1, 400)\n" ] } ], "source": [ "flatten = nn.Flatten()\n", "input_x = ms.Tensor(np.ones([1, 16, 5, 5]), ms.float32)\n", "output = flatten(input_x)\n", "\n", "print(output.shape)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### nn.Dense\n", "\n", "初始化`nn.Dense`层,对输入矩阵进行线性变换。" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "(1, 120)\n" ] } ], "source": [ "dense = nn.Dense(400, 120, weight_init='normal')\n", "input_x = ms.Tensor(np.ones([1, 400]), ms.float32)\n", "output = dense(input_x)\n", "\n", "print(output.shape)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 模型参数\n", "\n", "网络内部的卷积层和全连接层等实例化后,即具有权重参数和偏置参数,这些参数会在训练过程中不断进行优化,在训练过程中可通过 `get_parameters()` 来查看网络各层的名字、形状、数据类型和是否反向计算等信息。" ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "layer:backbone.conv1.weight, shape:(6, 1, 5, 5), dtype:Float32, requeires_grad:True\n", "layer:backbone.conv2.weight, shape:(16, 6, 5, 5), dtype:Float32, requeires_grad:True\n", "layer:backbone.fc1.weight, shape:(120, 400), dtype:Float32, requeires_grad:True\n", "layer:backbone.fc1.bias, shape:(120,), dtype:Float32, requeires_grad:True\n", "layer:backbone.fc2.weight, shape:(84, 120), dtype:Float32, requeires_grad:True\n", "layer:backbone.fc2.bias, shape:(84,), dtype:Float32, requeires_grad:True\n", "layer:backbone.fc3.weight, shape:(10, 84), dtype:Float32, requeires_grad:True\n", "layer:backbone.fc3.bias, shape:(10,), dtype:Float32, requeires_grad:True\n" ] } ], "source": [ "for m in model.get_parameters():\n", " print(f\"layer:{m.name}, shape:{m.shape}, dtype:{m.dtype}, requeires_grad:{m.requires_grad}\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 快速构建LeNet-5网络模型\n", "\n", "上述介绍了使用`mindspore.nn.cell`构建LeNet-5网络模型,在`mindvision.classification.models`中已有构建好的网络模型接口,也可使用`lenet`接口直接构建LeNet-5网络模型。" ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "layer:backbone.conv1.weight, shape:(6, 1, 5, 5), dtype:Float32, requeires_grad:True\n", "layer:backbone.conv2.weight, shape:(16, 6, 5, 5), dtype:Float32, requeires_grad:True\n", "layer:backbone.fc1.weight, shape:(120, 400), dtype:Float32, requeires_grad:True\n", "layer:backbone.fc1.bias, shape:(120,), dtype:Float32, requeires_grad:True\n", "layer:backbone.fc2.weight, shape:(84, 120), dtype:Float32, requeires_grad:True\n", "layer:backbone.fc2.bias, shape:(84,), dtype:Float32, requeires_grad:True\n", "layer:backbone.fc3.weight, shape:(10, 84), dtype:Float32, requeires_grad:True\n", "layer:backbone.fc3.bias, shape:(10,), dtype:Float32, requeires_grad:True\n" ] } ], "source": [ "from mindvision.classification.models import lenet\n", "\n", "# num_classes表示分类的数量,pretrained表示是否使用预训练模型进行训练\n", "model = lenet(num_classes=10, pretrained=False)\n", "\n", "for m in model.get_parameters():\n", " print(f\"layer:{m.name}, shape:{m.shape}, dtype:{m.dtype}, requeires_grad:{m.requires_grad}\")" ] } ], "metadata": { "kernelspec": { "display_name": "MindSpore", "language": "python", "name": "mindspore" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.9.0" } }, "nbformat": 4, "nbformat_minor": 4 }