mindspore.device_context.gpu.op_tuning.conv_fprop_algo
- mindspore.device_context.gpu.op_tuning.conv_fprop_algo(mode)[source]
Specifies the convolution forward algorithm. For detailed information, please refer to the NVIDIA cuDNN documentation on cudnnConvolutionForward.
- Parameters
mode (str) –
Convolution forward algorithm. If not configured, the framework defaults to 'normal'. The value range is as follows:
normal: Use cuDNN's heuristic search algorithm; an appropriate convolution algorithm is quickly selected based on the convolution shape and type. This option does not guarantee optimal performance.
performance: Use cuDNN's exhaustive trial search algorithm; all convolution algorithms are trial-run based on the convolution shape and type, and the fastest one is selected. This option ensures optimal performance at the cost of a longer startup time.
implicit_gemm: This algorithm expresses the convolution as a matrix product without actually explicitly forming the matrix that holds the input tensor data.
precomp_gemm: This algorithm expresses convolution as a matrix product without actually explicitly forming the matrix that holds the input tensor data, but still needs some memory workspace to precompute some indices in order to facilitate the implicit construction of the matrix that holds the input tensor data.
gemm: This algorithm expresses the convolution as an explicit matrix product. A significant memory workspace is needed to store the matrix that holds the input tensor data.
direct: This algorithm expresses the convolution as a direct convolution, without implicitly or explicitly doing a matrix multiplication.
fft: This algorithm uses the Fast-Fourier Transform approach to compute the convolution. A significant memory workspace is needed to store intermediate results.
fft_tiling: This algorithm uses the Fast-Fourier Transform approach but splits the inputs into tiles. A significant memory workspace is needed to store intermediate results, but less than the fft algorithm needs for large-size images.
winograd: This algorithm uses the Winograd Transform approach to compute the convolution. A reasonably sized workspace is needed to store intermediate results.
winograd_nonfused: This algorithm uses the Winograd Transform approach to compute the convolution. A significant workspace may be needed to store intermediate results.
Examples
>>> import mindspore as ms
>>> ms.device_context.gpu.op_tuning.conv_fprop_algo("performance")