Function Differences with torch.nn.functional.gelu

torch.nn.functional.gelu

torch.nn.functional.gelu(input) -> Tensor

For more information, see torch.nn.functional.gelu.

mindspore.ops.gelu

mindspore.ops.gelu(input_x, approximate='none')

For more information, see mindspore.ops.gelu.

Differences

PyTorch: This function computes the Gaussian Error Linear Unit, $GELU(x) = x \times \Phi(x)$, where $\Phi(x)$ is the cumulative distribution function of the standard Gaussian distribution. The input tensor may have any number of dimensions.

MindSpore: This API implements essentially the same function as PyTorch, and additionally provides the approximate parameter to select the GELU approximation algorithm.
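
To make the formula above concrete, here is a minimal scalar sketch in plain Python. It is not either framework's implementation; the helper name `gelu_exact` is hypothetical, and it assumes $\Phi$ is the standard normal CDF, written via the error function as $\Phi(x) = \frac{1}{2}\left(1 + \operatorname{erf}(x / \sqrt{2})\right)$.

```python
import math


def gelu_exact(x: float) -> float:
    """Hypothetical helper: exact GELU, x * Phi(x), for a single float."""
    # Phi(x): CDF of the standard normal distribution, expressed with erf.
    phi = 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))
    return x * phi


# Evaluate at the same inputs used in Code Example 1.
for x in (1.0, 2.0, 4.0):
    print(x, gelu_exact(x))
```

Up to float32 rounding, these values should line up with the PyTorch output shown in Code Example 1.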

Categories	Subcategories	PyTorch	MindSpore	Difference
Parameter	Parameter 1	-	approximate	Selects the GELU approximation algorithm: 'none' or 'tanh'. The default value is 'none'. In testing, the output is closer to PyTorch when approximate is 'none'.
Input	Single input	input	input_x	Same function, different parameter names
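
The difference between the two approximate settings can be sketched in plain Python. This is an illustration rather than MindSpore's implementation: it assumes 'none' means the exact erf-based formula and 'tanh' means the commonly used approximation $0.5x\left(1 + \tanh\left(\sqrt{2/\pi}\,(x + 0.044715x^3)\right)\right)$. The helper names `gelu_none` and `gelu_tanh` are hypothetical.

```python
import math


def gelu_none(x: float) -> float:
    """Hypothetical helper: exact GELU via the error function."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def gelu_tanh(x: float) -> float:
    """Hypothetical helper: tanh-based approximation of GELU."""
    inner = math.sqrt(2.0 / math.pi) * (x + 0.044715 * x ** 3)
    return 0.5 * x * (1.0 + math.tanh(inner))


# Compare the two variants at a few sample inputs.
for x in (1.0, 2.0, 4.0):
    exact, approx = gelu_none(x), gelu_tanh(x)
    print(f"x={x}: none={exact:.7f} tanh={approx:.7f} diff={abs(exact - approx):.1e}")
```

Under these assumptions the two variants differ only in the fourth decimal place for these inputs, which is consistent with 'none' being the closer match to the exact formula that PyTorch documents.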

Code Example 1

The two APIs compute the same function and are called in the same way. The outputs printed below agree to within about 2e-4.

# PyTorch
import torch
input = torch.Tensor([[2, 4], [1, 2]])
output = torch.nn.functional.gelu(input)
print(output.detach().numpy())
# [[1.9544997 3.9998734]
#  [0.8413447 1.9544997]]

# MindSpore
import mindspore
import numpy as np
x = mindspore.Tensor(np.array([[2, 4], [1, 2]]), mindspore.float32)
output = mindspore.ops.gelu(x)
print(output)
# [[1.9545977 3.99993]
#  [0.841192 1.9545977]]