mindspore_gs.ptq

Post training quantization algorithms.

import mindspore_gs.ptq as ptq

PTQ Config

mindspore_gs.ptq.PTQMode

Mode for the PTQ quantizer.

mindspore_gs.ptq.OutliersSuppressionType

Outlier suppression type for the PTQ quantizer.

mindspore_gs.ptq.PrecisionRecovery

Precision recovery algorithms.

mindspore_gs.ptq.PTQConfig

Config for post training quantization.

NetworkHelper

mindspore_gs.ptq.NetworkHelper

Helper class that decouples the quantization algorithm from the network framework.

mindspore_gs.ptq.network_helpers.mf_net_helpers.MFLlama2Helper

Derived from 'NetworkHelper', a utility class for the MindFormers framework Llama2 network.

mindspore_gs.ptq.network_helpers.mf_net_helpers.MFParallelLlama2Helper

Derived from 'NetworkHelper', a utility class for the MindFormers framework ParallelLlamaForCausalLM network.

Post Training Quantization Algorithm

mindspore_gs.ptq.PTQ

Implementation of the PTQ algorithm, which supports combined quantization of activations, weights, and the KV cache.

mindspore_gs.ptq.RoundToNearest

Native implementation of post training quantization based on min/max statistics.
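
To illustrate the general idea behind min/max round-to-nearest quantization, here is a minimal pure-Python sketch. It is not the mindspore_gs implementation and does not use its API: the function names (`minmax_quant_params`, `quantize`, `dequantize`) are hypothetical, and the sketch assumes asymmetric, per-tensor, signed 8-bit quantization.

```python
from typing import List, Tuple


def minmax_quant_params(values: List[float], num_bits: int = 8) -> Tuple[float, int]:
    """Derive a scale and zero point from the observed min/max (hypothetical helper)."""
    qmin, qmax = -(2 ** (num_bits - 1)), 2 ** (num_bits - 1) - 1
    # Extend the range to include 0.0 so that zero is exactly representable.
    lo = min(min(values), 0.0)
    hi = max(max(values), 0.0)
    if hi == lo:
        return 1.0, 0  # all-zero input: any positive scale works
    scale = (hi - lo) / (qmax - qmin)
    zero_point = int(round(qmin - lo / scale))
    return scale, zero_point


def quantize(values: List[float], scale: float, zero_point: int,
             num_bits: int = 8) -> List[int]:
    """Round each value to the nearest integer level and clamp to the integer range."""
    qmin, qmax = -(2 ** (num_bits - 1)), 2 ** (num_bits - 1) - 1
    # Python's round() resolves exact ties to the nearest even integer.
    return [max(qmin, min(qmax, int(round(v / scale)) + zero_point)) for v in values]


def dequantize(quantized: List[int], scale: float, zero_point: int) -> List[float]:
    """Map integer levels back to approximate floating-point values."""
    return [(q - zero_point) * scale for q in quantized]


# Example: quantize a small weight vector and reconstruct it.
weights = [-1.0, -0.5, 0.0, 0.25, 1.0]
scale, zero_point = minmax_quant_params(weights)
q_weights = quantize(weights, scale, zero_point)
restored = dequantize(q_weights, scale, zero_point)
```

Because the parameters come only from the minimum and maximum, a single outlier widens the range and coarsens the step size for every other value. Outlier suppression techniques target that weakness.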