mindspore_gs.ptq.PTQConfig

class mindspore_gs.ptq.PTQConfig(mode = PTQMode.QUANTIZE, backend = BackendTarget.ASCEND, opname_blacklist = <class 'list'>, algo_args = <class 'dict'>, weight_quant_dtype = Int8, kvcache_quant_dtype = None, act_quant_dtype = None, outliers_suppression = OutliersSuppressionType.NONE)

Configuration for post-training quantization (PTQ).

Parameters
  • mode (mindspore_gs.ptq.PTQMode) – PTQ mode: PTQMode.QUANTIZE for quantization mode, PTQMode.DEPLOY for deployment mode. Default: PTQMode.QUANTIZE.

  • backend (mindspore_gs.common.BackendTarget) – Backend target: BackendTarget.NONE for no specific backend, BackendTarget.ASCEND for the Ascend backend. Default: BackendTarget.ASCEND.

  • opname_blacklist (List[str]) – Blacklist of operator names. Layers in the network whose names fuzzy-match an entry in this blacklist will not be quantized.

  • algo_args (Union[dict, dataclass]) – Hyperparameters for the quantization algorithms, such as RTN, SmoothQuant, and OmniQuant.

  • act_quant_dtype (mindspore.dtype) – Quantization type for activations. mindspore.dtype.int8 quantizes activations to 8 bits; None leaves them unquantized. Default: None.

  • weight_quant_dtype (mindspore.dtype) – Quantization type for weights. mindspore.dtype.int8 quantizes weights to 8 bits; None leaves them unquantized. Default: mindspore.dtype.int8.

  • kvcache_quant_dtype (mindspore.dtype) – Quantization type for the KV cache. mindspore.dtype.int8 quantizes the KV cache to 8 bits; None leaves it unquantized. Default: None.

  • outliers_suppression (mindspore_gs.ptq.OutliersSuppressionType) – Outlier suppression method applied before quantization. OutliersSuppressionType.SMOOTH applies the smoothing method from SmoothQuant; OutliersSuppressionType.NONE (the default) leaves outliers untouched.
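
The exact rule behind the fuzzy matching of opname_blacklist is not spelled out here. As a rough illustration only, the sketch below assumes that an entry matches whenever it occurs as a substring of the layer name. The helper is_blacklisted and the layer names are hypothetical and are not part of mindspore_gs; the library's real matching rule may differ.

```python
from typing import List


def is_blacklisted(layer_name: str, opname_blacklist: List[str]) -> bool:
    """Return True if any blacklist entry occurs inside the layer name.

    Assumption: "fuzzy matched" is modeled as substring containment.
    """
    return any(entry in layer_name for entry in opname_blacklist)


# Hypothetical layer names, for illustration only.
layer_names = [
    "model.layers.0.attention.wq",
    "model.layers.0.feed_forward.w2",
    "lm_head",
]
blacklist = ["w2", "lm_head"]

# Layers that would still be quantized under the substring assumption.
to_quantize = [name for name in layer_names if not is_blacklisted(name, blacklist)]
print(to_quantize)  # ['model.layers.0.attention.wq']
```

Under this assumption a short entry such as 'w2' excludes every layer whose name contains it, so more specific entries are safer when only one layer should be skipped.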

Raises
  • ValueError – If mode is not PTQMode.QUANTIZE or PTQMode.DEPLOY.

  • ValueError – If backend is not BackendTarget.NONE or BackendTarget.ASCEND.

  • TypeError – If opname_blacklist is not a list of str.

  • ValueError – If weight_quant_dtype is not mindspore.dtype.int8 or None.

  • ValueError – If kvcache_quant_dtype is not mindspore.dtype.int8 or None.

  • ValueError – If act_quant_dtype is not mindspore.dtype.int8 or None.

  • TypeError – If outliers_suppression is not an OutliersSuppressionType.
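
The checks listed above can be mirrored in a small standalone dataclass. This is a minimal sketch of the documented rules, not the library's implementation: Mode, Backend, Suppression, INT8, ConfigSketch, and try_build are hypothetical stand-ins so that the snippet runs without MindSpore installed, and algo_args is omitted because no check is documented for it.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class Mode(Enum):  # hypothetical stand-in for PTQMode
    QUANTIZE = "quantize"
    DEPLOY = "deploy"


class Backend(Enum):  # hypothetical stand-in for BackendTarget
    NONE = "none"
    ASCEND = "ascend"


class Suppression(Enum):  # hypothetical stand-in for OutliersSuppressionType
    NONE = "none"
    SMOOTH = "smooth"


INT8 = "int8"  # hypothetical stand-in for mindspore.dtype.int8


@dataclass
class ConfigSketch:
    mode: Mode = Mode.QUANTIZE
    backend: Backend = Backend.ASCEND
    opname_blacklist: List[str] = field(default_factory=list)
    weight_quant_dtype: Optional[str] = INT8
    kvcache_quant_dtype: Optional[str] = None
    act_quant_dtype: Optional[str] = None
    outliers_suppression: Suppression = Suppression.NONE

    def __post_init__(self):
        # ValueError for an unsupported mode or backend.
        if self.mode not in (Mode.QUANTIZE, Mode.DEPLOY):
            raise ValueError(f"unsupported mode: {self.mode!r}")
        if self.backend not in (Backend.NONE, Backend.ASCEND):
            raise ValueError(f"unsupported backend: {self.backend!r}")
        # TypeError unless opname_blacklist is a list of str.
        names = self.opname_blacklist
        if not isinstance(names, list) or not all(isinstance(n, str) for n in names):
            raise TypeError("opname_blacklist must be a list of str")
        # ValueError unless each quantization dtype is int8 or None.
        for label in ("weight_quant_dtype", "kvcache_quant_dtype", "act_quant_dtype"):
            if getattr(self, label) not in (INT8, None):
                raise ValueError(f"{label} must be int8 or None")
        # TypeError for a value that is not a Suppression member.
        if not isinstance(self.outliers_suppression, Suppression):
            raise TypeError("outliers_suppression must be a Suppression member")


def try_build(**kwargs) -> str:
    """Return 'ok' or the name of the exception raised during construction."""
    try:
        ConfigSketch(**kwargs)
        return "ok"
    except (TypeError, ValueError) as err:
        return type(err).__name__


print(try_build(opname_blacklist=["layer0"]))  # ok
print(try_build(opname_blacklist="layer0"))    # TypeError
print(try_build(act_quant_dtype="int4"))       # ValueError
```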

Examples

>>> from mindspore_gs.ptq import PTQConfig, PTQMode
>>> from mindspore_gs.common import BackendTarget
>>> PTQConfig(mode=PTQMode.DEPLOY, backend=BackendTarget.ASCEND, opname_blacklist=['layer0'])
PTQConfig(mode=<PTQMode.DEPLOY: 'deploy'>, backend=<BackendTarget.ASCEND: 'ascend'>, opname_blacklist=['layer0'], algo_args={})