One-click Conversion from DNN to BNN


For deep neural network (DNN) researchers who are unfamiliar with Bayesian models, MindSpore Probability provides the high-level API TransformToBNN, which converts a DNN model into a Bayesian neural network (BNN) model in one step. The generality of this API has been verified on models such as LeNet, ResNet, MobileNet, and VGG. This example shows how to use the TransformToBNN API in the transforms module to convert a DNN into a BNN with one click.

The overall workflow is as follows:

  1. Define the DNN model;

  2. Define the loss function and the optimizer;

  3. Feature 1: convert the entire model;

  4. Feature 2: convert layers of a specified type.

This example runs on GPU and Ascend environments. The complete sample code can be downloaded here: https://gitee.com/mindspore/mindspore/tree/r1.7/tests/st/probability/transforms

Defining the DNN Model

The DNN model used in this example is LeNet5. After defining it, we print the names of its trainable layers. Since the conversion mainly targets convolutional and fully-connected layers, this example focuses on the conversion information of these two layer types.

[1]:
import mindspore.nn as nn
from mindspore.common.initializer import Normal

class LeNet5(nn.Cell):
    """Lenet network structure."""
    # define the operator required
    def __init__(self, num_class=10, num_channel=1):
        super(LeNet5, self).__init__()
        self.conv1 = nn.Conv2d(num_channel, 6, 5, pad_mode='valid')
        self.conv2 = nn.Conv2d(6, 16, 5, pad_mode='valid')
        self.fc1 = nn.Dense(16 * 5 * 5, 120, weight_init=Normal(0.02))
        self.fc2 = nn.Dense(120, 84, weight_init=Normal(0.02))
        self.fc3 = nn.Dense(84, num_class, weight_init=Normal(0.02))
        self.relu = nn.ReLU()
        self.max_pool2d = nn.MaxPool2d(kernel_size=2, stride=2)
        self.flatten = nn.Flatten()

    # use the preceding operators to construct networks
    def construct(self, x):
        x = self.max_pool2d(self.relu(self.conv1(x)))
        x = self.max_pool2d(self.relu(self.conv2(x)))
        x = self.flatten(x)
        x = self.relu(self.fc1(x))
        x = self.relu(self.fc2(x))
        x = self.fc3(x)
        return x

In the classic LeNet5 network there are two convolutional layers, conv1 and conv2, and three fully-connected layers: fc1, fc2, and fc3.
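As a side check, the `16 * 5 * 5` input size of fc1 follows from tracing the feature-map shapes through the network. The sketch below is a stand-alone illustration (it assumes the standard 32x32 LeNet input image, which the tutorial does not state explicitly) and needs no MindSpore:

```python
# Trace the spatial feature-map size through LeNet5, assuming a 32x32 input
# (the conventional LeNet input size; an assumption for this illustration).

def conv_valid(size, kernel):
    """Output size of a 'valid' (no padding) convolution with stride 1."""
    return size - kernel + 1

def max_pool(size, kernel=2, stride=2):
    """Output size of max pooling."""
    return (size - kernel) // stride + 1

size = 32
size = max_pool(conv_valid(size, 5))  # conv1 + pool: 32 -> 28 -> 14
size = max_pool(conv_valid(size, 5))  # conv2 + pool: 14 -> 10 -> 5
flat = 16 * size * size               # conv2 has 16 output channels
print(flat)  # -> 400, i.e. the 16 * 5 * 5 input expected by fc1
```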

Defining the Loss Function and Optimizer

This example uses the cross-entropy loss nn.SoftmaxCrossEntropyWithLogits as the loss function and nn.AdamWeightDecay as the optimizer.

Because the whole model will be converted to a BNN, the DNN network, the loss function, and the optimizer need to be wrapped into one complete training network, train_network.

[2]:
import pprint
from mindspore.nn import WithLossCell, TrainOneStepCell
from mindspore.nn.probability import transforms
from mindspore import context

# set device_target="Ascend" when running on an Ascend device
context.set_context(mode=context.GRAPH_MODE, device_target="GPU")

network = LeNet5()
criterion = nn.SoftmaxCrossEntropyWithLogits(sparse=True, reduction="mean")
optimizer = nn.AdamWeightDecay(params=network.trainable_params(), learning_rate=0.0001)
net_with_loss = WithLossCell(network, criterion)
train_network = TrainOneStepCell(net_with_loss, optimizer)

DNN_layer_name = [i.name for i in network.trainable_params()]
pprint.pprint(DNN_layer_name)
['conv1.weight',
 'conv2.weight',
 'fc1.weight',
 'fc1.bias',
 'fc2.weight',
 'fc2.bias',
 'fc3.weight',
 'fc3.bias']

The names printed above are the convolutional and fully-connected layer parameters before conversion.

Feature 1: Converting the Entire Model

Converting the entire model uses the TransformToBNN API in transforms. After the one-click conversion, we print the parameter names of the resulting BNN.

[3]:
# 60000 is the dnn_factor (scaling the task loss, here set to the training-set size)
# and 0.000001 is the bnn_factor (scaling the KL-divergence loss of the Bayesian layers)
bnn_transformer = transforms.TransformToBNN(train_network, 60000, 0.000001)
train_bnn_network = bnn_transformer.transform_to_bnn_model()

BNN_layer_name = [i.name for i in network.trainable_params()]
pprint.pprint(BNN_layer_name)
['conv1.weight_posterior.mean',
 'conv1.weight_posterior.untransformed_std',
 'conv2.weight_posterior.mean',
 'conv2.weight_posterior.untransformed_std',
 'fc1.weight_posterior.mean',
 'fc1.weight_posterior.untransformed_std',
 'fc1.bias_posterior.mean',
 'fc1.bias_posterior.untransformed_std',
 'fc2.weight_posterior.mean',
 'fc2.weight_posterior.untransformed_std',
 'fc2.bias_posterior.mean',
 'fc2.bias_posterior.untransformed_std',
 'fc3.weight_posterior.mean',
 'fc3.weight_posterior.untransformed_std',
 'fc3.bias_posterior.mean',
 'fc3.bias_posterior.untransformed_std']

The names printed above are the convolutional and fully-connected layer parameters after the whole model has been converted to a BNN: each original weight or bias is replaced by the mean and the untransformed standard deviation of its posterior distribution.
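The correspondence between the two name lists is mechanical: every DNN parameter p becomes p + "_posterior.mean" and p + "_posterior.untransformed_std", doubling the parameter count from 8 to 16. This can be checked stand-alone, without MindSpore:

```python
# Derive the BNN parameter names from the DNN ones: each weight/bias is replaced
# by the mean and the untransformed standard deviation of its posterior.
dnn_names = ['conv1.weight', 'conv2.weight',
             'fc1.weight', 'fc1.bias',
             'fc2.weight', 'fc2.bias',
             'fc3.weight', 'fc3.bias']

bnn_names = [n + suffix
             for n in dnn_names
             for suffix in ('_posterior.mean', '_posterior.untransformed_std')]

print(len(dnn_names), len(bnn_names))  # -> 8 16
print(bnn_names[0])                    # -> conv1.weight_posterior.mean
```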

Feature 2: Converting Layers of a Specified Type

The transform_to_bnn_layer method converts layers of a specified type (nn.Dense or nn.Conv2d) in a DNN model into the corresponding Bayesian layers. It is defined as follows:

transform_to_bnn_layer(dnn_layer, bnn_layer, get_args=None, add_args=None):

Parameters:

  • dnn_layer: the type of DNN layer to be converted into a BNN layer.

  • bnn_layer: the type of BNN layer that the DNN layer will be converted into.

  • get_args: the parameters to fetch from the DNN layer.

  • add_args: the parameters of the BNN layer whose values are re-assigned.
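The interaction between get_args and add_args can be pictured as assembling the keyword arguments for the new BNN layer's constructor. The sketch below is a hypothetical illustration of that idea only; DenseStub and build_bnn_kwargs are made-up names, not MindSpore's actual implementation:

```python
# Hypothetical illustration (not MindSpore source): get_args names the attributes
# to copy from the DNN layer, add_args supplies overrides or extra arguments
# for the replacement BNN layer.

class DenseStub:
    """Stand-in for nn.Dense, holding the attributes a converter would read."""
    def __init__(self, in_channels, out_channels, has_bias=True):
        self.in_channels = in_channels
        self.out_channels = out_channels
        self.has_bias = has_bias

def build_bnn_kwargs(dnn_layer, get_args=None, add_args=None):
    # Copy the requested attributes from the DNN layer ...
    kwargs = {name: getattr(dnn_layer, name) for name in (get_args or [])}
    # ... then apply the explicit overrides for the BNN layer.
    kwargs.update(add_args or {})
    return kwargs

fc = DenseStub(120, 84)
kwargs = build_bnn_kwargs(fc,
                          get_args=["in_channels", "out_channels"],
                          add_args={"activation": "relu"})
print(kwargs)  # -> {'in_channels': 120, 'out_channels': 84, 'activation': 'relu'}
```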

The following code converts the Dense layers of the DNN model into the corresponding Bayesian layers, DenseReparam:

[4]:
from mindspore.nn.probability import bnn_layers

network = LeNet5(10)
criterion = nn.SoftmaxCrossEntropyWithLogits(sparse=True, reduction="mean")
optimizer = nn.AdamWeightDecay(params=network.trainable_params(), learning_rate=0.0001)
net_with_loss = WithLossCell(network, criterion)
train_network = TrainOneStepCell(net_with_loss, optimizer)
bnn_transformer = transforms.TransformToBNN(train_network, 60000, 0.000001)
train_bnn_network = bnn_transformer.transform_to_bnn_layer(nn.Dense, bnn_layers.DenseReparam)

Run the following code to inspect the structure of the converted network:

[5]:
DNN_layer_name = [i.name for i in network.trainable_params()]
pprint.pprint(DNN_layer_name)
['conv1.weight',
 'conv2.weight',
 'fc1.weight_posterior.mean',
 'fc1.weight_posterior.untransformed_std',
 'fc1.bias_posterior.mean',
 'fc1.bias_posterior.untransformed_std',
 'fc2.weight_posterior.mean',
 'fc2.weight_posterior.untransformed_std',
 'fc2.bias_posterior.mean',
 'fc2.bias_posterior.untransformed_std',
 'fc3.weight_posterior.mean',
 'fc3.weight_posterior.untransformed_std',
 'fc3.bias_posterior.mean',
 'fc3.bias_posterior.untransformed_std']

After transform_to_bnn_layer, all fully-connected layers of the LeNet network have been converted into Bayesian layers, while the convolutional layers remain unchanged.