mindspore_gs.ptq.NetworkHelper

class mindspore_gs.ptq.NetworkHelper[source]

Utility class for decoupling the algorithm layer from the network framework layer, so that algorithm implementations do not depend on a specific framework.
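
NetworkHelper itself only defines the interface; concrete helpers such as MFLlama2Helper implement it for a specific framework. A minimal sketch of a custom subclass is shown below; the build_my_network helper and my_config mapping are hypothetical placeholders, not part of this API.

>>> from mindspore_gs.ptq import NetworkHelper
>>> class MyNetHelper(NetworkHelper):
...     """Hypothetical helper adapting a custom framework to the PTQ algorithms."""
...     def create_network(self):
...         # Build and return the network to be quantized (placeholder logic).
...         return build_my_network()
...     def get_spec(self, name: str):
...         # Look up specifications such as "batch_size" or "seq_length" from the framework config.
...         return my_config[name]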

analysis_decoder_groups(network)[source]

Analyze the decoder group information in the network.

Parameters:
  • network (Cell) - The network whose decoder group information is to be analyzed.

Examples:

>>> from mindspore import context
>>> from mindspore_gs.ptq.network_helpers.mf_net_helpers import MFLlama2Helper
>>> from mindformers import LlamaForCausalLM, LlamaConfig
>>> from mindformers.tools.register.config import MindFormerConfig
>>> context.set_context(mode=context.GRAPH_MODE, device_target="Ascend")
>>> mf_yaml_config_file = "/path/to/mf_yaml_config_file"
>>> mfconfig = MindFormerConfig(mf_yaml_config_file)
>>> helper = MFLlama2Helper(mfconfig)
>>> network = LlamaForCausalLM(LlamaConfig(**mfconfig.model.model_config))
>>> helper.analysis_decoder_groups(network)
assemble_inputs(input_ids: np.ndarray, **kwargs)[source]

Assemble the inputs required for network inference from the input tokens.

Parameters:
  • input_ids (numpy.ndarray) - The input tokens.

  • kwargs (Dict) - Extensible arguments for subclasses.

Returns:

A list of mindspore.Tensor, the inputs for network inference.

Examples:

>>> import numpy as np
>>> from mindspore_gs.ptq.network_helpers.mf_net_helpers import MFLlama2Helper
>>> from mindformers.tools.register.config import MindFormerConfig
>>> mf_yaml_config_file = "/path/to/mf_yaml_config_file"
>>> mfconfig = MindFormerConfig(mf_yaml_config_file)
>>> helper = MFLlama2Helper(mfconfig)
>>> input_ids = np.array([[1, 10000]], dtype=np.int32)
>>> helper.assemble_inputs(input_ids)
(Tensor(shape=[1, 4096], dtype=Int32, value=
[[    1, 10000,     0 ...     0,     0]]), None, None, None, None, None, None, None, None, None, Tensor(shape=[1, 256], dtype=Int32, value=
[[  0,   1,   2 ... 253, 254, 255]]), Tensor(shape=[2], dtype=Int32, value= [0, 1]))
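
The assembled inputs can be fed directly to the network for a single forward pass. A minimal sketch, assuming `network` has already been created (for example via create_network) and that the returned tuple matches the network's call signature:

>>> inputs = helper.assemble_inputs(input_ids)
>>> outputs = network(*inputs)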
create_network()[source]

Create the network.

Returns:

The created network.

Examples:

>>> from mindspore_gs.ptq.network_helpers.mf_net_helpers import MFLlama2Helper
>>> from mindformers.tools.register.config import MindFormerConfig
>>> mf_yaml_config_file = "/path/to/mf_yaml_config_file"
>>> mfconfig = MindFormerConfig(mf_yaml_config_file)
>>> helper = MFLlama2Helper(mfconfig)
>>> network = helper.create_network()
create_tokenizer(**kwargs)[source]

Obtain the tokenizer of the network.

Parameters:
  • kwargs (Dict) - Extensible arguments for subclasses.

Returns:

An object, the tokenizer of the network.

Examples:

>>> from mindspore_gs.ptq.network_helpers.mf_net_helpers import MFLlama2Helper
>>> from mindformers.tools.register.config import MindFormerConfig
>>> mf_yaml_config_file = "/path/to/mf_yaml_config_file"
>>> mfconfig = MindFormerConfig(mf_yaml_config_file)
>>> helper = MFLlama2Helper(mfconfig)
>>> helper.create_tokenizer()
LlamaTokenizer(name_or_path='', vocab_size=32000, model_max_length=100000, added_tokens_decoder={
0: AddedToken("<unk>", rstrip=False, lstrip=False, normalized=True, special=True),
1: AddedToken("<s>", rstrip=False, lstrip=False, normalized=True, special=True),
2: AddedToken("</s>", rstrip=False, lstrip=False, normalized=True, special=True),
})
generate(network: Cell, input_ids: Union[np.ndarray, List[int], List[List[int]]], max_new_tokens=None, **kwargs)[source]

Perform autoregressive inference on the network to generate a sequence of tokens.

Parameters:
  • network (Cell) - The network used for autoregressive generation.

  • input_ids (Union[numpy.ndarray, List[int], List[List[int]]]) - The input tokens for generation.

  • max_new_tokens (int) - The maximum number of tokens to generate. Default: 1.

  • kwargs (Dict) - Extensible arguments for subclasses.

Returns:

A list, the generated tokens.

Examples:

>>> import numpy as np
>>> from mindspore import context
>>> from mindspore_gs.ptq.network_helpers.mf_net_helpers import MFLlama2Helper
>>> from mindformers import LlamaForCausalLM, LlamaConfig
>>> from mindformers.tools.register.config import MindFormerConfig
>>> context.set_context(mode=context.GRAPH_MODE, device_target="Ascend")
>>> mf_yaml_config_file = "/path/to/mf_yaml_config_file"
>>> mfconfig = MindFormerConfig(mf_yaml_config_file)
>>> helper = MFLlama2Helper(mfconfig)
>>> network = LlamaForCausalLM(LlamaConfig(**mfconfig.model.model_config))
>>> input_ids = np.array([[1, 10000]], dtype=np.int32)
>>> helper.generate(network, input_ids)
array([[    1, 10000, 10001]], dtype=int32)
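
The generated token ids can be decoded back to text with the tokenizer returned by create_tokenizer. A minimal sketch, assuming the tokenizer follows the usual decode interface; the decoded string depends on the actual vocabulary:

>>> tokenizer = helper.create_tokenizer()
>>> output_ids = helper.generate(network, input_ids)
>>> tokenizer.decode(output_ids[0])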
get_pre_layer(linear_name)[source]

Obtain the information of the previous layer by the name of the current linear layer.

Parameters:
  • linear_name (str) - The name of the linear layer.

Returns:

A dict, the information of the previous layer, including the layer name, the layer itself, and its type.

Examples:

>>> from mindspore import context
>>> from mindspore_gs.ptq.network_helpers.mf_net_helpers import MFLlama2Helper
>>> from mindformers import LlamaForCausalLM, LlamaConfig
>>> from mindformers.tools.register.config import MindFormerConfig
>>> context.set_context(mode=context.GRAPH_MODE, device_target="Ascend")
>>> mf_yaml_config_file = "/path/to/mf_yaml_config_file"
>>> mfconfig = MindFormerConfig(mf_yaml_config_file)
>>> helper = MFLlama2Helper(mfconfig)
>>> network = LlamaForCausalLM(LlamaConfig(**mfconfig.model.model_config))
>>> helper.analysis_decoder_groups(network)
>>> linear_name = "model.layers.0.attention.wq"  # example name; use a linear layer name from your own network
>>> helper.get_pre_layer(linear_name)
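
The call above returns the previous layer's information as described in Returns; a hedged sketch of consuming it (the "name" and "type" keys are assumptions for illustration, check the actual return value for the exact keys):

>>> pre_layer = helper.get_pre_layer(linear_name)
>>> if pre_layer is not None:
...     print(pre_layer["name"], pre_layer["type"])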
get_spec(name: str)[source]

Obtain a specification of the network, such as batch_size or seq_length.

Parameters:
  • name (str) - The name of the specification to obtain.

Returns:

An object, the obtained network specification.

Examples:

>>> from mindspore_gs.ptq.network_helpers.mf_net_helpers import MFLlama2Helper
>>> from mindformers.tools.register.config import MindFormerConfig
>>> mf_yaml_config_file = "/path/to/mf_yaml_config_file"
>>> mfconfig = MindFormerConfig(mf_yaml_config_file)
>>> helper = MFLlama2Helper(mfconfig)
>>> helper.get_spec("batch_size")
1 (The output depends on `mfconfig`; the value shown here is only an example.)
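
Other specifications declared in the configuration can be queried the same way; for example, the sequence length (again, the value depends on `mfconfig`):

>>> helper.get_spec("seq_length")
4096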