{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# 网络参数\n", "\n", "[![下载Notebook](https://mindspore-website.obs.cn-north-4.myhuaweicloud.com/website-images/r1.8/resource/_static/logo_notebook.png)](https://obs.dualstack.cn-north-4.myhuaweicloud.com/mindspore-website/notebook/r1.8/tutorials/zh_cn/advanced/network/mindspore_parameter.ipynb) [![下载样例代码](https://mindspore-website.obs.cn-north-4.myhuaweicloud.com/website-images/r1.8/resource/_static/logo_download_code.png)](https://obs.dualstack.cn-north-4.myhuaweicloud.com/mindspore-website/notebook/r1.8/tutorials/zh_cn/advanced/network/mindspore_parameter.py) [![查看源文件](https://mindspore-website.obs.cn-north-4.myhuaweicloud.com/website-images/r1.8/resource/_static/logo_source.png)](https://gitee.com/mindspore/docs/blob/r1.8/tutorials/source_zh_cn/advanced/network/parameter.ipynb)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "MindSpore提供了关于变量、网络相关参数的初始化模块,用户可以通过封装算子来调用字符串、Initializer子类或自定义Tensor等方式完成对网络参数进行初始化。\n", "\n", "下面图中蓝色表示具体的执行算子,绿色的表示张量Tensor,张量作为神经网络模型中的数据在网络中不断流动,主要包括网络模型的数据输入,算子的输入输出数据等;红色的为变量Parameter,作为网络模型或者模型中算子的属性,及其反向图中产生的中间变量和临时变量。\n", "\n", "![parameter.png](https://mindspore-website.obs.cn-north-4.myhuaweicloud.com/website-images/r1.8/tutorials/source_zh_cn/advanced/network/images/parameter.png)\n", "\n", "本章主要介绍数据类型`dtype`、变量`Parameter`、变量元组`ParameterTuple`、网络的初始化方法和网络参数更新。\n", "\n", "## 数据类型 dtype\n", "\n", "MindSpore张量支持不同的数据类型`dtype`,包含int8、int16、int32、int64、uint8、uint16、uint32、uint64、float16、float32、float64、bool_,与NumPy的数据类型一一对应。详细的数据类型支持情况请参考:[mindspore.dtype](https://www.mindspore.cn/docs/zh-CN/r1.8/api_python/mindspore/mindspore.dtype.html#mindspore.dtype)。\n", "\n", "> 在MindSpore的运算处理流程中,Python中的int数会被转换为定义的int64类型,float数会被转换为定义的float32类型。\n", "\n", "以下代码,打印MindSpore的数据类型int32。" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Int32\n" ] } ], "source": [ "import mindspore as ms\n", "\n", "data_type = ms.int32\n", "print(data_type)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 数据类型转换接口\n", "\n", "MindSpore提供了以下几个接口,实现与NumPy数据类型和Python内置的数据类型间的转换。\n", "\n", "- `dtype_to_nptype`:将MindSpore的数据类型转换为NumPy对应的数据类型。\n", "- `dtype_to_pytype`:将MindSpore的数据类型转换为Python对应的内置数据类型。\n", "- `pytype_to_dtype`:将Python内置的数据类型转换为MindSpore对应的数据类型。\n", "\n", "以下代码实现了不同数据类型间的转换,并打印转换后的类型。" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\n", "Int64\n", "\n" ] } ], "source": [ "import mindspore as ms\n", "\n", "np_type = ms.dtype_to_nptype(ms.int32)\n", "ms_type = ms.pytype_to_dtype(int)\n", "py_type = ms.dtype_to_pytype(ms.float64)\n", "\n", "print(np_type)\n", "print(ms_type)\n", "print(py_type)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 变量 Parameter\n", "\n", "MindSpore的变量([Parameter](https://www.mindspore.cn/docs/zh-CN/r1.8/api_python/mindspore/mindspore.Parameter.html#mindspore.Parameter))表示在网络训练时,需要被更新的参数。例如,在前行计算的时候最常见的`nn.conv`算子的变量有权重`weight`和偏置`bias`;在构建反向图和反向传播计算的时候,会产生很多中间变量,用于暂存一阶梯度信息、中间输出值等。\n", "\n", "### 变量初始化\n", "\n", "变量`Parameter`的初始化方法有很多种,可以接收`Tensor`、`Initializer`等不同的数据类型。\n", "\n", "- `default_input`:为输入数据,支持传入`Tensor`、`Initializer`、`int`和`float`四种数据类型;\n", "- `name`:可设置变量的名称,用于在网络中区别于其他变量;\n", "- `requires_grad`:表示在网络训练过程,是否需要计算参数梯度,如果不需要计算参数梯度,将`requires_grad`设置为`False`。\n", "\n", "下面的示例代码中,使用`int`或`float`数据类型直接创建Parameter:" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\n", "Parameter (name=x, shape=(), dtype=Float32, requires_grad=True) value: 2.0\n", "Parameter (name=y, shape=(), dtype=Float32, requires_grad=True) value: 5.0\n", "Parameter (name=z, shape=(), dtype=Int32, requires_grad=False) value: 5\n" ] } ], "source": [ "import mindspore as ms\n", "\n", "x = ms.Parameter(default_input=2.0, name='x')\n", "y = ms.Parameter(default_input=5.0, name='y')\n", "z = ms.Parameter(default_input=5, name='z', requires_grad=False)\n", "\n", "print(type(x))\n", "print(x, \"value:\", x.asnumpy())\n", "print(y, \"value:\", y.asnumpy())\n", "print(z, \"value:\", z.asnumpy())" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "下面示例代码中,使用MindSpore的张量`Tensor`创建Parameter:" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Parameter (name=tensor, shape=(2, 3), dtype=Int64, requires_grad=True)\n" ] } ], "source": [ "import numpy as np\n", "import mindspore as ms\n", "\n", "my_tensor = ms.Tensor(np.arange(2 * 3).reshape((2, 3)))\n", "x = ms.Parameter(default_input=my_tensor, name=\"tensor\")\n", "\n", "print(x)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "下面示例代码中,使用`Initializer`创建Parameter:" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Parameter (name=x, shape=(1, 2, 3), dtype=Float32, requires_grad=True)\n" ] } ], "source": [ "from mindspore.common.initializer import initializer as init\n", "import mindspore as ms\n", "\n", "x = ms.Parameter(default_input=init('ones', [1, 2, 3], ms.float32), name='x')\n", "print(x)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 属性\n", "\n", "变量`Parameter`的默认属性有变量名称`name`、形状`shape`、数据类型`dtype`和是否需要进行求导`requires_grad`。\n", "\n", "下例通过`Tensor`初始化一个变量`Parameter`,并获取变量`Parameter`的相关属性。示例代码如下:" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "x: Parameter (name=x, shape=(2, 3), dtype=Int64, requires_grad=True)\n", "x.data: Parameter (name=x, shape=(2, 3), dtype=Int64, requires_grad=True)\n" ] } ], "source": [ "my_tensor = ms.Tensor(np.arange(2 * 3).reshape((2, 3)))\n", "x = ms.Parameter(default_input=my_tensor, name=\"x\")\n", "\n", "print(\"x: \", x)\n", "print(\"x.data: \", x.data)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 变量操作\n", "\n", "1. `clone`:克隆变量张量`Parameter`,克隆完成后可以给新的变量`Parameter`指定新的名称。" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Parameter (name=Parameter, shape=(1, 2, 3), dtype=Float32, requires_grad=True)\n", "Parameter (name=x_clone, shape=(1, 2, 3), dtype=Float32, requires_grad=True)\n" ] } ], "source": [ "x = ms.Parameter(default_input=init('ones', [1, 2, 3], ms.float32))\n", "x_clone = x.clone()\n", "x_clone.name = \"x_clone\"\n", "\n", "print(x)\n", "print(x_clone)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "2. `set_data`:修改变量`Parameter`的数据或形状`shape`。\n", "\n", "其中,`set_data`方法有`data`和`slice_shape`两种入参。`data`表示变量`Parameter`新传入的数据;`slice_shape`表示是否修改变量`Parameter`的形状`shape`,默认为False。" ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Parameter (name=x, shape=(1, 2), dtype=Float32, requires_grad=True) [[1. 1.]]\n", "Parameter (name=x, shape=(1, 2), dtype=Float32, requires_grad=True) [[0. 0.]]\n", "Parameter (name=x, shape=(1, 4), dtype=Float32, requires_grad=True) [[1. 1. 1. 1.]]\n" ] } ], "source": [ "x = ms.Parameter(ms.Tensor(np.ones((1, 2)), ms.float32), name=\"x\", requires_grad=True)\n", "print(x, x.asnumpy())\n", "\n", "y = x.set_data(ms.Tensor(np.zeros((1, 2)), ms.float32))\n", "print(y, y.asnumpy())\n", "\n", "z = x.set_data(ms.Tensor(np.ones((1, 4)), ms.float32), slice_shape=True)\n", "print(z, z.asnumpy())" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "3. `init_data`:并行场景下存在参数的形状发生变化的情况,用户可以调用`Parameter`的`init_data`方法得到原始数据。" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Parameter (name=x, shape=(1, 2), dtype=Float32, requires_grad=True) [[1. 1.]]\n" ] } ], "source": [ "x = ms.Parameter(ms.Tensor(np.ones((1, 2)), ms.float32), name=\"x\", requires_grad=True)\n", "\n", "print(x.init_data(), x.init_data().asnumpy())" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 变量参数更新\n", "\n", "MindSpore提供了网络参数更新功能,使用`nn.ParameterUpdate`可对网络参数进行更新,其输入的参数类型必须为张量,且张量`shape`需要与原网络参数`shape`保持一致。\n", "\n", "更新网络的权重参数示例如下:" ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Parameter:\n", " [[-0.0164615 -0.01204428 -0.00813806]\n", " [-0.00270927 -0.0113328 -0.01384139]\n", " [ 0.00849093 0.00351116 0.00989969]\n", " [ 0.00233028 0.00649209 -0.0021333 ]]\n", "Parameter update:\n", " [[ 0. 1. 2.]\n", " [ 3. 4. 5.]\n", " [ 6. 7. 8.]\n", " [ 9. 10. 11.]]\n" ] } ], "source": [ "import numpy as np\n", "import mindspore as ms\n", "from mindspore import nn\n", "\n", "# 构建网络\n", "network = nn.Dense(3, 4)\n", "\n", "# 获取网络的权重参数\n", "param = network.parameters_dict()['weight']\n", "print(\"Parameter:\\n\", param.asnumpy())\n", "\n", "# 更新权重参数\n", "update = nn.ParameterUpdate(param)\n", "weight = ms.Tensor(np.arange(12).reshape((4, 3)), ms.float32)\n", "output = update(weight)\n", "print(\"Parameter update:\\n\", output)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 变量元组 Parameter Tuple\n", "\n", "变量元组[ParameterTuple](https://www.mindspore.cn/docs/zh-CN/r1.8/api_python/mindspore/mindspore.ParameterTuple.html#mindspore.ParameterTuple),用于保存多个`Parameter`,继承于元组`tuple`,提供克隆功能。\n", "\n", "如下示例提供`ParameterTuple`创建方法:" ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "(Parameter (name=x, shape=(2, 3), dtype=Int64, requires_grad=True), Parameter (name=y, shape=(1, 2, 3), dtype=Float32, requires_grad=True), Parameter (name=z, shape=(), dtype=Float32, requires_grad=True))\n", "(Parameter (name=params_copy.x, shape=(2, 3), dtype=Int64, requires_grad=True), Parameter (name=params_copy.y, shape=(1, 2, 3), dtype=Float32, requires_grad=True), Parameter (name=params_copy.z, shape=(), dtype=Float32, requires_grad=True))\n" ] } ], "source": [ "import numpy as np\n", "import mindspore as ms\n", "from mindspore.common.initializer import initializer\n", "\n", "# 创建\n", "x = ms.Parameter(default_input=ms.Tensor(np.arange(2 * 3).reshape((2, 3))), name=\"x\")\n", "y = ms.Parameter(default_input=initializer('ones', [1, 2, 3], ms.float32), name='y')\n", "z = ms.Parameter(default_input=2.0, name='z')\n", "params = ms.ParameterTuple((x, y, z))\n", "\n", "# 从params克隆并修改名称为\"params_copy\"\n", "params_copy = params.clone(\"params_copy\")\n", "\n", "print(params)\n", "print(params_copy)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 网络参数初始化\n", "\n", "MindSpore提供了多种网络参数初始化的方式,并在部分算子中封装了参数初始化的功能。本节以`Conv2d`算子为例,分别介绍使用`Initializer`子类,字符串和自定义`Tensor`等方式对网络中的参数进行初始化。\n", "\n", "### Initializer初始化\n", "\n", "使用`Initializer`对网络参数进行初始化,示例代码如下:" ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [], "source": [ "import numpy as np\n", "import mindspore.nn as nn\n", "import mindspore as ms\n", "from mindspore.common import initializer as init\n", "\n", "ms.set_seed(1)\n", "\n", "input_data = ms.Tensor(np.ones([1, 3, 16, 50], dtype=np.float32))\n", "# 卷积层,输入通道为3,输出通道为64,卷积核大小为3*3,权重参数使用正态分布生成的随机数\n", "net = nn.Conv2d(3, 64, 3, weight_init=init.Normal(0.2))\n", "# 网络输出\n", "output = net(input_data)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 字符串初始化\n", "\n", "使用字符串对网络参数进行初始化,字符串的内容需要与`Initializer`的名称保持一致(字母不区分大小写),使用字符串方式进行初始化将使用`Initializer`类中的默认参数,例如使用字符串`Normal`等同于使用`Initializer`的`Normal()`,示例如下:\n" ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [], "source": [ "import numpy as np\n", "import mindspore.nn as nn\n", "import mindspore as ms\n", "\n", "ms.set_seed(1)\n", "\n", "input_data = ms.Tensor(np.ones([1, 3, 16, 50], dtype=np.float32))\n", "net = nn.Conv2d(3, 64, 3, weight_init='Normal')\n", "output = net(input_data)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 张量初始化\n", "\n", "用户也可以通过自定义`Tensor`的方式,来对网络模型中算子的参数进行初始化,示例代码如下:" ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [], "source": [ "import numpy as np\n", "import mindspore.nn as nn\n", "import mindspore as ms\n", "\n", "init_data = ms.Tensor(np.ones([64, 3, 3, 3]), dtype=ms.float32)\n", "input_data = ms.Tensor(np.ones([1, 3, 16, 50], dtype=np.float32))\n", "\n", "net = nn.Conv2d(3, 64, 3, weight_init=init_data)\n", "output = net(input_data)" ] } ], "metadata": { "kernelspec": { "display_name": "MindSpore", "language": "python", "name": "mindspore" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.3" } }, "nbformat": 4, "nbformat_minor": 4 }