mindspore.device_context.gpu.op_tuning.conv_wgrad_algo

mindspore.device_context.gpu.op_tuning.conv_wgrad_algo(mode)

Specifies the convolution filter gradient algorithm. For detailed information, please refer to the NVIDIA cuDNN documentation.

Parameters

mode (str) –

Convolution filter gradient algorithm. If not configured, the framework defaults to 'normal'. The supported values are as follows:

  • normal: Uses cuDNN's heuristic search, which quickly selects a suitable convolution algorithm based on the convolution shape and type. This value does not guarantee optimal performance.

  • performance: Uses cuDNN's trial search, which trial-runs all convolution algorithms for the given convolution shape and type and selects the best-performing one. This value ensures optimal performance.

  • algo_0: This algorithm expresses the convolution as a sum of matrix products without explicitly forming the matrix that holds the input tensor data. The sum is computed with atomic add operations, so the results are non-deterministic.

  • algo_1: This algorithm expresses the convolution as a matrix product without explicitly forming the matrix that holds the input tensor data. The results are deterministic.

  • algo_3: This algorithm is similar to algo_0 but uses a small workspace to precompute indices. The results are also non-deterministic.

  • fft: This algorithm uses a Fast Fourier Transform (FFT) approach to compute the convolution. A significant memory workspace is needed to store intermediate results. The results are deterministic.

  • fft_tiling: This algorithm uses the Fast Fourier Transform approach but splits the inputs into tiles. A significant memory workspace is needed to store intermediate results, but less than fft requires for large images. The results are deterministic.

  • winograd_nonfused: This algorithm uses the Winograd Transform approach to compute the convolution. A significant workspace may be needed to store intermediate results. The results are deterministic.

Examples

>>> import mindspore as ms
>>> ms.device_context.gpu.op_tuning.conv_wgrad_algo("performance")
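
The determinism notes in the value list can be captured in a small lookup so that a mode string is checked before it is passed to conv_wgrad_algo. The following is a minimal sketch only: WGRAD_ALGO_DETERMINISM, check_wgrad_mode and deterministic_modes are hypothetical helper names, not part of MindSpore, and the table records only what the list above states ('normal' and 'performance' are marked None because the list does not state their determinism).

```python
# Hypothetical helper, not part of MindSpore: it mirrors the determinism
# statements from the value list in this page.
WGRAD_ALGO_DETERMINISM = {
    "normal": None,        # heuristic selection; determinism not stated
    "performance": None,   # trial-run selection; determinism not stated
    "algo_0": False,       # uses atomic add
    "algo_1": True,
    "algo_3": False,       # similar to algo_0
    "fft": True,
    "fft_tiling": True,
    "winograd_nonfused": True,
}


def check_wgrad_mode(mode, require_deterministic=False):
    """Return mode unchanged if it is a documented value, else raise ValueError."""
    if mode not in WGRAD_ALGO_DETERMINISM:
        raise ValueError(f"unknown conv_wgrad_algo mode: {mode!r}")
    if require_deterministic and WGRAD_ALGO_DETERMINISM[mode] is not True:
        raise ValueError(f"mode {mode!r} is not documented as deterministic")
    return mode


def deterministic_modes():
    """List the modes that the value list documents as deterministic."""
    return sorted(m for m, d in WGRAD_ALGO_DETERMINISM.items() if d is True)


if __name__ == "__main__":
    mode = check_wgrad_mode("algo_1", require_deterministic=True)
    print(mode, deterministic_modes())
    # With MindSpore installed on a GPU machine, the checked value would
    # then be passed on as in the example above:
    #   import mindspore as ms
    #   ms.device_context.gpu.op_tuning.conv_wgrad_algo(mode)
```

Checking the string first gives an early, readable error for a misspelled mode and makes it explicit when a run depends on reproducible weight gradients.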