mindspore_gs.ptq.NetworkHelper

class mindspore_gs.ptq.NetworkHelper

Helper class that decouples the algorithm from the network framework.
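As a rough illustration of this decoupling idea, the sketch below mirrors the four methods documented on this page in plain Python. It does not import mindspore_gs: `HelperInterface`, `DictBackedHelper`, and `calibration_shape` are hypothetical names invented for this example, and the real class may be structured differently.

```python
from abc import ABC, abstractmethod


class HelperInterface(ABC):
    """Hypothetical stand-in for the four methods documented on this page."""

    @abstractmethod
    def get_spec(self, name: str):
        """Return a network specification such as batch_size."""

    @abstractmethod
    def create_tokenizer(self, **kwargs):
        """Return a tokenizer for the network."""

    @abstractmethod
    def assemble_inputs(self, input_ids, **kwargs):
        """Turn token ids into the inputs the network expects."""

    @abstractmethod
    def generate(self, network, input_ids, max_new_tokens=1, **kwargs):
        """Run the network and return generated tokens."""


class DictBackedHelper(HelperInterface):
    """Toy implementation backed by a plain dict instead of a real config."""

    def __init__(self, specs):
        self._specs = dict(specs)

    def get_spec(self, name):
        return self._specs[name]

    def create_tokenizer(self, **kwargs):
        # Whitespace splitting stands in for a real tokenizer.
        return str.split

    def assemble_inputs(self, input_ids, **kwargs):
        return [list(input_ids)]

    def generate(self, network, input_ids, max_new_tokens=1, **kwargs):
        # Toy behaviour: a single call to the network; max_new_tokens is ignored.
        return network(*self.assemble_inputs(input_ids))


def calibration_shape(helper: HelperInterface):
    """'Algorithm' code: it only talks to the helper, never to the network."""
    return helper.get_spec("batch_size"), helper.get_spec("seq_length")


helper = DictBackedHelper({"batch_size": 1, "seq_length": 4096})
shape = calibration_shape(helper)
print(shape)
```

Because `calibration_shape` depends only on the helper's methods, a different network framework could be supported by writing another helper subclass without touching the algorithm code.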

assemble_inputs(input_ids: np.ndarray, **kwargs)

Assemble the network inputs for prediction from input tokens given as a numpy ndarray.

Parameters
  • input_ids (numpy.ndarray) – Input tokens.

  • kwargs (Dict) – Extensible parameter for subclasses.

Returns

A list of mindspore.Tensor objects to use as inputs for network prediction.

Examples

>>> import numpy as np
>>> from mindspore_gs.ptq import MFLlama2Helper
>>> from mindformers.tools.register.config import MindFormerConfig
>>> mf_yaml_config_file = "/path/to/mf_yaml_config_file"
>>> mfconfig = MindFormerConfig(mf_yaml_config_file)
>>> helper = MFLlama2Helper(mfconfig)
>>> input_ids = np.array([[1, 10000]], dtype=np.int32)
>>> helper.assemble_inputs(input_ids)
(Tensor(shape=[1, 4096], dtype=Int32, value=
[[    1, 10000,     0 ...     0,     0]]), None, None, None, None, None, None, None, None, None, Tensor(shape=[1, 256], dtype=Int32, value=
[[  0,   1,   2 ... 253, 254, 255]]), Tensor(shape=[2], dtype=Int32, value= [0, 1]))
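The output above suggests that the two input tokens are right-padded with zeros up to a fixed length of 4096. As a rough illustration of that one step, the NumPy sketch below pads a batch of token ids. `pad_input_ids`, its `seq_length` argument, and the `pad_token_id` default of 0 are hypothetical names chosen for this example; they are not part of mindspore_gs, and the real helper also builds the other tensors shown above.

```python
import numpy as np


def pad_input_ids(input_ids: np.ndarray, seq_length: int, pad_token_id: int = 0) -> np.ndarray:
    """Right-pad a (batch, tokens) array to shape (batch, seq_length)."""
    batch_size, num_tokens = input_ids.shape
    if num_tokens > seq_length:
        raise ValueError(f"got {num_tokens} tokens, but seq_length is {seq_length}")
    # Start from an array filled with the padding id, then copy the tokens in.
    padded = np.full((batch_size, seq_length), pad_token_id, dtype=input_ids.dtype)
    padded[:, :num_tokens] = input_ids
    return padded


input_ids = np.array([[1, 10000]], dtype=np.int32)
padded = pad_input_ids(input_ids, seq_length=4096)
print(padded.shape)   # (1, 4096)
print(padded[0, :4])  # the two input tokens followed by padding
```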

create_tokenizer(**kwargs)

Get the tokenizer for the network.

Parameters

kwargs (Dict) – Extensible parameter for subclasses.

Returns

The tokenizer object for the network.

Examples

>>> from mindspore_gs.ptq import MFLlama2Helper
>>> from mindformers.tools.register.config import MindFormerConfig
>>> mf_yaml_config_file = "/path/to/mf_yaml_config_file"
>>> mfconfig = MindFormerConfig(mf_yaml_config_file)
>>> helper = MFLlama2Helper(mfconfig)
>>> helper.create_tokenizer()
LlamaTokenizer(name_or_path='', vocab_size=32000, model_max_length=100000,  added_tokens_decoder={
        0: AddedToken("<unk>", rstrip=False, lstrip=False, normalized=True, special=True),
        1: AddedToken("<s>", rstrip=False, lstrip=False, normalized=True, special=True),
        2: AddedToken("</s>", rstrip=False, lstrip=False, normalized=True, special=True),
}

generate(network: Cell, input_ids: np.ndarray, max_new_tokens=1, **kwargs)

Invoke the network and generate tokens.

Parameters
  • network (Cell) – Network used to generate tokens.

  • input_ids (numpy.ndarray) – Input tokens for generation.

  • max_new_tokens (int) – Maximum number of tokens to generate. Default: 1.

  • kwargs (Dict) – Extensible parameter for subclasses.

Returns

A list of generated tokens.

Examples

>>> import numpy as np
>>> from mindspore import context
>>> from mindspore_gs.ptq import MFLlama2Helper
>>> from mindformers import LlamaForCausalLM, LlamaConfig
>>> from mindformers.tools.register.config import MindFormerConfig
>>> context.set_context(mode=context.GRAPH_MODE, device_target="Ascend")
>>> mf_yaml_config_file = "/path/to/mf_yaml_config_file"
>>> mfconfig = MindFormerConfig(mf_yaml_config_file)
>>> helper = MFLlama2Helper(mfconfig)
>>> network = LlamaForCausalLM(LlamaConfig(**mfconfig.model.model_config))
>>> input_ids = np.array([[1, 10000]], dtype=np.int32)
>>> helper.generate(network, input_ids)
array([[    1, 10000, 10001]], dtype=int32)
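In the output above, one new token is appended to the two input tokens, which matches the default max_new_tokens=1. To make the role of max_new_tokens concrete, here is a toy token-by-token loop in plain NumPy. It is only a sketch: `toy_generate` and `increment_last_token` are hypothetical, the "network" is an arbitrary stand-in callable that returns the last token plus one, and the real helper may handle padding, caching, and stop tokens quite differently.

```python
import numpy as np


def toy_generate(next_token_fn, input_ids: np.ndarray, max_new_tokens: int = 1) -> np.ndarray:
    """Append up to max_new_tokens tokens, one per step, to each row of input_ids."""
    tokens = input_ids.copy()
    for _ in range(max_new_tokens):
        # next_token_fn returns one token per batch row; make it a column.
        new_column = next_token_fn(tokens).reshape(-1, 1).astype(tokens.dtype)
        tokens = np.concatenate([tokens, new_column], axis=1)
    return tokens


def increment_last_token(tokens: np.ndarray) -> np.ndarray:
    """Stand-in for a network: 'predicts' the last token plus one."""
    return tokens[:, -1] + 1


input_ids = np.array([[1, 10000]], dtype=np.int32)
one_step = toy_generate(increment_last_token, input_ids)
three_steps = toy_generate(increment_last_token, input_ids, max_new_tokens=3)
print(one_step.shape, three_steps.shape)  # (1, 3) (1, 5)
```

Each step adds exactly one column, so the result has `input_ids.shape[1] + max_new_tokens` tokens per row in this toy version, which never stops early.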

get_spec(name: str)

Get a network specification, such as batch_size or seq_length.

Parameters

name (str) – Name of the specification.

Returns

The value of the requested network specification.

Examples

>>> from mindspore_gs.ptq import MFLlama2Helper
>>> from mindformers.tools.register.config import MindFormerConfig
>>> mf_yaml_config_file = "/path/to/mf_yaml_config_file"
>>> mfconfig = MindFormerConfig(mf_yaml_config_file)
>>> helper = MFLlama2Helper(mfconfig)
>>> helper.get_spec("batch_size")
1 (The output depends on `mfconfig`; the value shown here is only an example.)