mindspore_gs.ptq.NetworkHelper
- class mindspore_gs.ptq.NetworkHelper[source]
NetworkHelper for decoupling quantization algorithms from the underlying network framework.
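To illustrate the decoupling idea, here is a minimal, framework-free sketch (a toy stand-in, not the MindSpore implementation): the algorithm talks to the network only through the helper interface, so the same algorithm code works regardless of which framework backs the helper. `ToyHelper` and `calibrate` are hypothetical names used purely for illustration.

```python
# Minimal sketch of the NetworkHelper pattern: the algorithm depends only
# on the abstract helper interface, never on framework-specific types.
from abc import ABC, abstractmethod


class NetworkHelperSketch(ABC):
    @abstractmethod
    def create_network(self): ...

    @abstractmethod
    def assemble_inputs(self, input_ids, **kwargs): ...

    @abstractmethod
    def generate(self, network, input_ids, max_new_tokens=1, **kwargs): ...


class ToyHelper(NetworkHelperSketch):
    """Hypothetical helper around a toy 'network' that appends next tokens."""

    def create_network(self):
        # The toy network predicts the next token as last-token + 1.
        return lambda tokens: tokens + [tokens[-1] + 1]

    def assemble_inputs(self, input_ids, **kwargs):
        return list(input_ids)

    def generate(self, network, input_ids, max_new_tokens=1, **kwargs):
        tokens = self.assemble_inputs(input_ids)
        for _ in range(max_new_tokens):
            tokens = network(tokens)
        return tokens


def calibrate(helper: NetworkHelperSketch, sample):
    # Algorithm-side code: framework-agnostic, uses only the helper API.
    network = helper.create_network()
    return helper.generate(network, sample, max_new_tokens=2)


print(calibrate(ToyHelper(), [1, 10000]))  # [1, 10000, 10001, 10002]
```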
- analysis_decoder_groups(network)[source]
Analyze the decoder group information of the network.
- Parameters
network (Cell) – Network whose decoder group information will be analyzed.
Examples
>>> from mindspore import context
>>> from mindspore_gs.ptq.network_helpers.mf_net_helpers import MFLlama2Helper
>>> from mindformers import LlamaForCausalLM, LlamaConfig
>>> from mindformers.tools.register.config import MindFormerConfig
>>> context.set_context(mode=context.GRAPH_MODE, device_target="Ascend")
>>> mf_yaml_config_file = "/path/to/mf_yaml_config_file"
>>> mfconfig = MindFormerConfig(mf_yaml_config_file)
>>> helper = MFLlama2Helper(mfconfig)
>>> network = LlamaForCausalLM(LlamaConfig(**mfconfig.model.model_config))
>>> helper.analysis_decoder_groups(network)
- assemble_inputs(input_ids: np.ndarray, **kwargs)[source]
Assemble network inputs for prediction from input tokens in numpy.ndarray format.
- Parameters
input_ids (numpy.ndarray) – Input tokens.
kwargs (Dict) – Extensible parameter for subclasses.
- Returns
A list of mindspore.Tensor to be used as inputs for network prediction.
Examples
>>> import numpy as np
>>> from mindspore_gs.ptq.network_helpers.mf_net_helpers import MFLlama2Helper
>>> from mindformers.tools.register.config import MindFormerConfig
>>> mf_yaml_config_file = "/path/to/mf_yaml_config_file"
>>> mfconfig = MindFormerConfig(mf_yaml_config_file)
>>> helper = MFLlama2Helper(mfconfig)
>>> input_ids = np.array([[1, 10000]], dtype=np.int32)
>>> helper.assemble_inputs(input_ids)
(Tensor(shape=[1, 4096], dtype=Int32, value=
[[    1, 10000,     0 ...     0,     0]]),
None, None, None, None, None, None, None, None, None,
Tensor(shape=[1, 256], dtype=Int32, value=
[[  0,   1,   2 ... 253, 254, 255]]),
Tensor(shape=[2], dtype=Int32, value= [0, 1]))
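The output above shows the token batch right-padded with zeros to the model's sequence length. A framework-free sketch of that padding step (a hypothetical illustration, not MFLlama2Helper's actual implementation; `assemble_inputs_sketch` and the default `seq_length` are assumptions):

```python
import numpy as np


def assemble_inputs_sketch(input_ids: np.ndarray, seq_length: int = 4096) -> np.ndarray:
    """Right-pad a batch of input tokens with zeros to a fixed sequence length."""
    batch, n_tokens = input_ids.shape
    padded = np.zeros((batch, seq_length), dtype=np.int32)
    padded[:, :n_tokens] = input_ids  # original tokens first, zeros after
    return padded


inputs = assemble_inputs_sketch(np.array([[1, 10000]], dtype=np.int32))
print(inputs.shape)  # (1, 4096)
```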
- create_network()[source]
Create a network.
- Returns
Created network.
Examples
>>> from mindspore_gs.ptq.network_helpers.mf_net_helpers import MFLlama2Helper
>>> from mindformers.tools.register.config import MindFormerConfig
>>> mf_yaml_config_file = "/path/to/mf_yaml_config_file"
>>> mfconfig = MindFormerConfig(mf_yaml_config_file)
>>> helper = MFLlama2Helper(mfconfig)
>>> network = helper.create_network()
- create_tokenizer(**kwargs)[source]
Create a tokenizer for the network.
- Parameters
kwargs (Dict) – Extensible parameter for subclasses.
- Returns
A tokenizer object for the network.
Examples
>>> from mindspore_gs.ptq.network_helpers.mf_net_helpers import MFLlama2Helper
>>> from mindformers.tools.register.config import MindFormerConfig
>>> mf_yaml_config_file = "/path/to/mf_yaml_config_file"
>>> mfconfig = MindFormerConfig(mf_yaml_config_file)
>>> helper = MFLlama2Helper(mfconfig)
>>> helper.create_tokenizer()
LlamaTokenizer(name_or_path='', vocab_size=32000, model_max_length=100000, added_tokens_decoder={
    0: AddedToken("<unk>", rstrip=False, lstrip=False, normalized=True, special=True),
    1: AddedToken("<s>", rstrip=False, lstrip=False, normalized=True, special=True),
    2: AddedToken("</s>", rstrip=False, lstrip=False, normalized=True, special=True),
})
- generate(network: Cell, input_ids: Union[np.ndarray, List[int], List[List[int]]], max_new_tokens=None, **kwargs)[source]
Invoke the network and generate tokens.
- Parameters
network (Cell) – Network used to generate tokens.
input_ids (Union[numpy.ndarray, List[int], List[List[int]]]) – Input tokens for generation.
max_new_tokens (int) – Maximum number of tokens to generate. Default: 1.
kwargs (Dict) – Extensible parameter for subclasses.
- Returns
Generated tokens as a list.
Examples
>>> import numpy as np
>>> from mindspore import context
>>> from mindspore_gs.ptq.network_helpers.mf_net_helpers import MFLlama2Helper
>>> from mindformers import LlamaForCausalLM, LlamaConfig
>>> from mindformers.tools.register.config import MindFormerConfig
>>> context.set_context(mode=context.GRAPH_MODE, device_target="Ascend")
>>> mf_yaml_config_file = "/path/to/mf_yaml_config_file"
>>> mfconfig = MindFormerConfig(mf_yaml_config_file)
>>> helper = MFLlama2Helper(mfconfig)
>>> network = LlamaForCausalLM(LlamaConfig(**mfconfig.model.model_config))
>>> input_ids = np.array([[1, 10000]], dtype=np.int32)
>>> helper.generate(network, input_ids)
array([[    1, 10000, 10001]], dtype=int32)
- get_pre_layer(linear_name)[source]
Get information about the layer preceding the linear layer named linear_name.
- Parameters
linear_name (str) – Name of the linear layer.
- Returns
A dict of preceding-layer information, including the layer name, the layer itself, and its type.
Examples
>>> from mindspore import context
>>> from mindspore_gs.ptq.network_helpers.mf_net_helpers import MFLlama2Helper
>>> from mindformers import LlamaForCausalLM, LlamaConfig
>>> from mindformers.tools.register.config import MindFormerConfig
>>> context.set_context(mode=context.GRAPH_MODE, device_target="Ascend")
>>> mf_yaml_config_file = "/path/to/mf_yaml_config_file"
>>> mfconfig = MindFormerConfig(mf_yaml_config_file)
>>> helper = MFLlama2Helper(mfconfig)
>>> network = LlamaForCausalLM(LlamaConfig(**mfconfig.model.model_config))
>>> helper.analysis_decoder_groups(network)
>>> helper.get_pre_layer(linear_name)
- get_spec(name: str)[source]
Get a network-specific value, such as batch_size or seq_length.
- Parameters
name (str) – Name of the network-specific value.
- Returns
The requested network-specific value.
Examples
>>> from mindspore_gs.ptq.network_helpers.mf_net_helpers import MFLlama2Helper
>>> from mindformers.tools.register.config import MindFormerConfig
>>> mf_yaml_config_file = "/path/to/mf_yaml_config_file"
>>> mfconfig = MindFormerConfig(mf_yaml_config_file)
>>> helper = MFLlama2Helper(mfconfig)
>>> helper.get_spec("batch_size")
1
(The output depends on `mfconfig`; the value shown here is only an example.)
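Conceptually, a get_spec-style lookup reads named specifications out of the model configuration. A hypothetical, framework-free sketch of that pattern (`ToySpecHelper` is an assumed name, not part of mindspore_gs):

```python
# Toy illustration of a get_spec-style lookup over a model config.
class ToySpecHelper:
    def __init__(self, config: dict):
        self._config = config

    def get_spec(self, name: str):
        """Return the named specification, raising KeyError if unknown."""
        if name not in self._config:
            raise KeyError(f"Unknown network spec: {name}")
        return self._config[name]


helper = ToySpecHelper({"batch_size": 1, "seq_length": 4096})
print(helper.get_spec("batch_size"))  # 1
print(helper.get_spec("seq_length"))  # 4096
```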