mindspore_gs.ptq

Post-training quantization algorithms.

import mindspore_gs.ptq as ptq

PTQ Config

mindspore_gs.ptq.PTQConfig

Config for post-training quantization.

PTQMode Enum

mindspore_gs.ptq.PTQMode

Mode for the PTQ quantizer.

OutliersSuppressionType Enum

mindspore_gs.ptq.OutliersSuppressionType

Outlier suppression type for the PTQ quantizer.

NetworkHelper

mindspore_gs.ptq.NetworkHelper

Helper class that decouples the quantization algorithm from the network framework.

mindspore_gs.ptq.network_helpers.mf_net_helpers.MFLlama2Helper

Derived from 'NetworkHelper', a utility class for the MindFormers framework Llama2 network.

mindspore_gs.ptq.network_helpers.mf_net_helpers.MFParallelLlama2Helper

Derived from 'NetworkHelper', a utility class for the MindFormers framework ParallelLlamaForCausalLM network.

PTQ Algorithm

mindspore_gs.ptq.PTQ

Implementation of the PTQ algorithm, which supports combined quantization of activations, weights, and the KV cache.

RoundToNearest Algorithm

mindspore_gs.ptq.RoundToNearest

Native implementation of post-training quantization based on min/max statistics.
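
To illustrate what quantization "based on min/max statistic values" generally means, the following is a minimal, framework-independent sketch in plain Python. It does not use or reproduce the mindspore_gs API; the function names `minmax_quantize` and `dequantize`, the asymmetric scheme, and the 8-bit default are assumptions chosen for illustration, and the actual RoundToNearest implementation may differ.

```python
def minmax_quantize(values, num_bits=8):
    """Quantize floats to signed integers using min/max statistics.

    Hypothetical helper for illustration only, not part of mindspore_gs.
    """
    qmin = -(2 ** (num_bits - 1))
    qmax = 2 ** (num_bits - 1) - 1
    # Extend the observed range to include zero so that 0.0 is representable.
    lo = min(min(values), 0.0)
    hi = max(max(values), 0.0)
    # One integer step covers `scale` units of the float range.
    scale = (hi - lo) / (qmax - qmin) if hi > lo else 1.0
    # Integer offset that maps the float minimum close to qmin.
    zero_point = round(qmin - lo / scale)
    # Round each value to the nearest step, shift, and clamp to the range.
    quantized = [
        max(qmin, min(qmax, round(v / scale) + zero_point)) for v in values
    ]
    return quantized, scale, zero_point


def dequantize(quantized, scale, zero_point):
    """Map quantized integers back to approximate float values."""
    return [(q - zero_point) * scale for q in quantized]


weights = [-1.0, -0.25, 0.0, 0.5, 1.0]
q, scale, zp = minmax_quantize(weights)
restored = dequantize(q, scale, zp)
```

Each restored value differs from its original by at most about one and a half quantization steps in this sketch; the extra margin over half a step comes from rounding the zero point and clamping at the range edges.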