Differences with torch.nn.Dropout

torch.nn.Dropout

torch.nn.Dropout(p=0.5, inplace=False)

For more information, see torch.nn.Dropout.

mindspore.nn.Dropout

mindspore.nn.Dropout(keep_prob=0.5, p=None, dtype=mstype.float32)

For more information, see mindspore.nn.Dropout.

Differences

PyTorch: Dropout is a regularization layer. During training, the operator randomly sets some neuron outputs to 0 with the dropout probability p, reducing overfitting by preventing co-adaptation between neuron nodes.

MindSpore: The MindSpore API implements essentially the same functionality as PyTorch. keep_prob is the retention probability of the input neurons; it is deprecated and will be removed in a future version. dtype sets the data type of the output Tensor and is also deprecated.

| Categories | Subcategories | PyTorch | MindSpore | Difference |
| ---------- | ------------- | ------- | --------- | ---------- |
| Parameters | Parameter 1 | - | keep_prob | Deprecated parameter in MindSpore |
| Parameters | Parameter 2 | p | p | The parameter names and functions are the same |
| Parameters | Parameter 3 | inplace | - | MindSpore does not have this parameter |
| Parameters | Parameter 4 | - | dtype | Deprecated parameter in MindSpore |

Dropout is often used to prevent overfitting during training. It takes one important probability parameter, and the meaning of this parameter in MindSpore's legacy interface is the opposite of its meaning in PyTorch and TensorFlow.

In MindSpore, the probability value corresponds to the keep_prob attribute of the Dropout operator and indicates the probability that an input element is retained; 1 - keep_prob is the probability that the element is set to 0.

In PyTorch and TensorFlow, the probability values correspond to the attributes p and rate of the Dropout operator, respectively. They indicate the probability that an input element is set to 0, which is the opposite of the meaning of keep_prob in mindspore.nn.Dropout.
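As a quick sketch of this relationship (assuming a MindSpore release in which the deprecated keep_prob argument is still accepted), keep_prob=0.8 and p=0.2 describe the same layer:

# MindSpore: keep_prob and p are complementary views of the same layer.
# keep_prob is deprecated; the p form is the recommended spelling.
from mindspore import nn

legacy = nn.Dropout(keep_prob=0.8)   # retain each element with probability 0.8 (deprecated)
current = nn.Dropout(p=0.2)          # zero each element with probability 0.2 (recommended)
# During training, both layers zero each element with probability 0.2,
# matching torch.nn.Dropout(p=0.2) in PyTorch.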

In PyTorch, a network is in training mode by default, while in MindSpore it is in inference mode by default. Therefore, by default a Dropout layer called inside a MindSpore network does not take effect and returns its input unchanged; dropout is applied only after the network is switched to training mode with the net.set_train() method.
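The following minimal sketch illustrates this default-mode difference; it assumes the behavior described above, where a MindSpore Dropout cell returns its input unchanged until set_train() is called:

# MindSpore
import mindspore
from mindspore import Tensor, nn

x = Tensor([[1.0, 2.0, 3.0, 4.0]], mindspore.float32)
dropout = nn.Dropout(p=0.5)

print(dropout(x))    # inference mode (default): the input is returned unchanged

dropout.set_train()  # switch the cell to training mode
print(dropout(x))    # about half the elements are zeroed; the rest are scaled by 1/(1-p)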

Code Example

When the inplace argument is False, the two APIs implement the same function.

# PyTorch
import torch
from torch import tensor
x = tensor([[1.00, 2.00, 3.00, 4.00, 5.00, 6.00, 7.00, 8.00, 9.00, 10.00],
            [1.00, 2.00, 3.00, 4.00, 5.00, 6.00, 7.00, 8.00, 9.00, 10.00],
            [1.00, 2.00, 3.00, 4.00, 5.00, 6.00, 7.00, 8.00, 9.00, 10.00],
            [1.00, 2.00, 3.00, 4.00, 5.00, 6.00, 7.00, 8.00, 9.00, 10.00],
            [1.00, 2.00, 3.00, 4.00, 5.00, 6.00, 7.00, 8.00, 9.00, 10.00]])
output = torch.nn.Dropout(p=0.2, inplace=False)(x)
print(output.shape)
# torch.Size([5, 10])

# MindSpore
import mindspore
from mindspore import Tensor
x = Tensor([[1.00, 2.00, 3.00, 4.00, 5.00, 6.00, 7.00, 8.00, 9.00, 10.00],
            [1.00, 2.00, 3.00, 4.00, 5.00, 6.00, 7.00, 8.00, 9.00, 10.00],
            [1.00, 2.00, 3.00, 4.00, 5.00, 6.00, 7.00, 8.00, 9.00, 10.00],
            [1.00, 2.00, 3.00, 4.00, 5.00, 6.00, 7.00, 8.00, 9.00, 10.00],
            [1.00, 2.00, 3.00, 4.00, 5.00, 6.00, 7.00, 8.00, 9.00, 10.00]], mindspore.float32)
output = mindspore.nn.Dropout(p=0.2)(x)
print(output.shape)
# (5, 10)
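Note that during training both APIs scale the retained elements by 1/(1 - p), so the expected value of the output matches the input. The example above only checks the output shape, which is identical in either mode.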