mindspore_gs.ptq.NetworkHelper
- class mindspore_gs.ptq.NetworkHelper[source]
Utility class that decouples the algorithm layer from the network framework layer, so that algorithm implementations do not depend on a specific framework.
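Algorithm code only calls the methods listed below; adapting to a new framework means providing a subclass that implements them. The following is a minimal sketch of such a subclass; the class MyNetHelper, the build_my_network factory and the config layout are hypothetical and only illustrate the intent, they are not part of this API:
>>> from mindspore_gs.ptq import NetworkHelper
>>> class MyNetHelper(NetworkHelper):
...     """Hypothetical helper binding the PTQ algorithms to a custom framework."""
...     def __init__(self, config):
...         self.config = config
...     def create_network(self):
...         # Build and return the network from the framework-specific config.
...         # `build_my_network` is an assumed user-provided factory function.
...         return build_my_network(self.config)
...     def get_spec(self, name: str):
...         # Look up specs such as "batch_size" or "seq_length" from the config.
...         return getattr(self.config, name)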
- analysis_decoder_groups(network)[source]
Analyze the decoder-group information in the network.
- Parameters:
network (Cell) - The network whose decoder-group information is to be analyzed.
Examples:
>>> from mindspore import context
>>> from mindspore_gs.ptq.network_helpers.mf_net_helpers import MFLlama2Helper
>>> from mindformers import LlamaForCausalLM, LlamaConfig
>>> from mindformers.tools.register.config import MindFormerConfig
>>> context.set_context(mode=context.GRAPH_MODE, device_target="Ascend")
>>> mf_yaml_config_file = "/path/to/mf_yaml_config_file"
>>> mfconfig = MindFormerConfig(mf_yaml_config_file)
>>> helper = MFLlama2Helper(mfconfig)
>>> network = LlamaForCausalLM(LlamaConfig(**mfconfig.model.model_config))
>>> helper.analysis_decoder_groups(network)
- assemble_inputs(input_ids: np.ndarray, **kwargs)[source]
Assemble the inputs required for network inference from the input tokens.
- Parameters:
input_ids (numpy.ndarray) - The input tokens.
kwargs (Dict) - Extensible keyword arguments for subclasses.
- Returns:
A list of mindspore.Tensor, the inputs used for network inference.
Examples:
>>> import numpy as np
>>> from mindspore_gs.ptq.network_helpers.mf_net_helpers import MFLlama2Helper
>>> from mindformers.tools.register.config import MindFormerConfig
>>> mf_yaml_config_file = "/path/to/mf_yaml_config_file"
>>> mfconfig = MindFormerConfig(mf_yaml_config_file)
>>> helper = MFLlama2Helper(mfconfig)
>>> input_ids = np.array([[1, 10000]], dtype=np.int32)
>>> helper.assemble_inputs(input_ids)
(Tensor(shape=[1, 4096], dtype=Int32, value=
[[1, 10000, 0 ... 0, 0]]),
None, None, None, None, None, None, None, None, None,
Tensor(shape=[1, 256], dtype=Int32, value=
[[0, 1, 2 ... 253, 254, 255]]),
Tensor(shape=[2], dtype=Int32, value= [0, 1]))
- create_network()[source]
Create the network.
- Returns:
The created network.
Examples:
>>> from mindspore_gs.ptq.network_helpers.mf_net_helpers import MFLlama2Helper
>>> from mindformers.tools.register.config import MindFormerConfig
>>> mf_yaml_config_file = "/path/to/mf_yaml_config_file"
>>> mfconfig = MindFormerConfig(mf_yaml_config_file)
>>> helper = MFLlama2Helper(mfconfig)
>>> network = helper.create_network()
- create_tokenizer(**kwargs)[source]
Get the tokenizer of the network.
- Parameters:
kwargs (Dict) - Extensible keyword arguments for subclasses.
- Returns:
An object, the tokenizer of the network.
Examples:
>>> from mindspore_gs.ptq.network_helpers.mf_net_helpers import MFLlama2Helper
>>> from mindformers.tools.register.config import MindFormerConfig
>>> mf_yaml_config_file = "/path/to/mf_yaml_config_file"
>>> mfconfig = MindFormerConfig(mf_yaml_config_file)
>>> helper = MFLlama2Helper(mfconfig)
>>> helper.create_tokenizer()
LlamaTokenizer(name_or_path='', vocab_size=32000, model_max_length=100000, added_tokens_decoder={
    0: AddedToken("<unk>", rstrip=False, lstrip=False, normalized=True, special=True),
    1: AddedToken("<s>", rstrip=False, lstrip=False, normalized=True, special=True),
    2: AddedToken("</s>", rstrip=False, lstrip=False, normalized=True, special=True),
})
- generate(network: Cell, input_ids: Union[np.ndarray, List[int], List[List[int]]], max_new_tokens=None, **kwargs)[source]
Perform autoregressive inference on the network to generate a sequence of tokens.
- Parameters:
network (Cell) - The network used for autoregressive generation.
input_ids (numpy.ndarray) - The input tokens used for generation.
max_new_tokens (int) - Maximum number of new tokens to generate. Default: 1.
kwargs (Dict) - Extensible keyword arguments for subclasses.
- Returns:
A list, the generated tokens.
Examples:
>>> import numpy as np
>>> from mindspore import context
>>> from mindspore_gs.ptq.network_helpers.mf_net_helpers import MFLlama2Helper
>>> from mindformers import LlamaForCausalLM, LlamaConfig
>>> from mindformers.tools.register.config import MindFormerConfig
>>> context.set_context(mode=context.GRAPH_MODE, device_target="Ascend")
>>> mf_yaml_config_file = "/path/to/mf_yaml_config_file"
>>> mfconfig = MindFormerConfig(mf_yaml_config_file)
>>> helper = MFLlama2Helper(mfconfig)
>>> network = LlamaForCausalLM(LlamaConfig(**mfconfig.model.model_config))
>>> input_ids = np.array([[1, 10000]], dtype=np.int32)
>>> helper.generate(network, input_ids)
array([[    1, 10000, 10001]], dtype=int32)
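The length of the generated sequence can also be capped explicitly through max_new_tokens. A hedged continuation of the example above; the output is omitted here because it depends on the loaded weights:
>>> helper.generate(network, input_ids, max_new_tokens=16)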
- get_pre_layer(linear_name)[source]
Get the information of the previous layer from the name of the current linear layer.
- Parameters:
linear_name (str) - Name of the current linear layer in the network.
- Returns:
A dict with the information of the previous layer, including the layer name, the layer itself and its type.
Examples:
>>> from mindspore import context
>>> from mindspore_gs.ptq.network_helpers.mf_net_helpers import MFLlama2Helper
>>> from mindformers import LlamaForCausalLM, LlamaConfig
>>> from mindformers.tools.register.config import MindFormerConfig
>>> context.set_context(mode=context.GRAPH_MODE, device_target="Ascend")
>>> mf_yaml_config_file = "/path/to/mf_yaml_config_file"
>>> mfconfig = MindFormerConfig(mf_yaml_config_file)
>>> helper = MFLlama2Helper(mfconfig)
>>> network = LlamaForCausalLM(LlamaConfig(**mfconfig.model.model_config))
>>> helper.analysis_decoder_groups(network)
>>> helper.get_pre_layer(linear_name)
- get_spec(name: str)[source]
Get a specification of the network, such as batch_size or seq_length.
- Parameters:
name (str) - Name of the specification to get.
- Returns:
An object, the obtained network specification.
Examples:
>>> from mindspore_gs.ptq.network_helpers.mf_net_helpers import MFLlama2Helper
>>> from mindformers.tools.register.config import MindFormerConfig
>>> mf_yaml_config_file = "/path/to/mf_yaml_config_file"
>>> mfconfig = MindFormerConfig(mf_yaml_config_file)
>>> helper = MFLlama2Helper(mfconfig)
>>> helper.get_spec("batch_size")
1
(The output depends on `mfconfig`; the value shown here is only an example.)