mindformers.models.PretrainedConfig

class mindformers.models.PretrainedConfig(**kwargs)

Base class for all configuration classes. Handles a few parameters common to all models' configurations as well as methods for loading/downloading/saving configurations.

Note

A configuration file can be loaded and saved to disk. Loading the configuration file and using this file to initialize a model does not load the model weights. It only affects the model's configuration.

Parameters
  • name_or_path (str, optional) – Stores the string passed to mindformers.models.PreTrainedModel.from_pretrained() as pretrained_model_name_or_path, if the configuration was created with that method. Default: "".

  • checkpoint_name_or_path (str, optional) – The path or name of the checkpoint file. Default: None.

  • mindformers_version (str, optional) – The version of MindSpore Transformers. Default: None.

Returns

PretrainedConfig, a PretrainedConfig instance.

Examples

>>> from mindformers.models import LlamaConfig
>>> config = LlamaConfig(num_layers=2, seq_length=1024)
>>> print(config)
LlamaConfig {
    "batch_size": 1,
    "block_size": 16,
    "bos_token_id": 1,
    "checkpoint_name_or_path": "",
    "compute_dtype": "float16",
    "do_sample": true,
    "embedding_init_type": "float16",
    "eos_token_id": 2,
    "extend_method": "None",
    "ffn_dim_multiplier": null,
    "fine_grain_interleave": 1,
    "hidden_size": 4096,
    "ignore_token_id": -100,
    "intermediate_size": null,
    "is_dynamic": false,
    "layernorm_compute_type": "float32",
    "llm_backend": "",
    "max_decode_length": 1024,
    "max_position_embedding": 1024,
    "mindformers_version": "dev",
    "model_type": "llama",
    "multiple_of": 256,
    "n_kv_heads": null,
    "num_blocks": 512,
    "num_heads": 32,
    "num_layers": 2,
    "offset": 0,
    "pad_token_id": 0,
    "parallel_decoding_params": null,
    "parallel_optimizer": false,
    "param_init_type": "float16",
    "pp_interleave_num": 1,
    "qkv_concat": false,
    "qkv_has_bias": false,
    "quant_config": null,
    "repetition_penalty": 1.0,
    "rms_norm_eps": 1e-05,
    "rotary_dtype": "float32",
    "scaling_factor": 1.0,
    "seq_length": 1024,
    "softmax_compute_type": "float32",
    "theta": 10000.0,
    "tie_word_embeddings": false,
    "top_k": 5,
    "top_p": 1.0,
    "use_attn_mask_compression": false,
    "use_flash_attention": false,
    "use_past": false,
    "use_ring_attention": false,
    "use_rope_slice": false,
    "vocab_size": 32000
}
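
The printed form above looks like the class name followed by a JSON dump: booleans appear as true/false, None appears as null, and keys are in alphabetical order. A minimal sketch of that rendering with the standard json module follows. It assumes this is how the string is produced; the "ToyConfig" label and the three sample values are illustrative only.

```python
import json

# Illustrative values only; the keys mirror a few entries from the output above.
params = {"num_layers": 2, "do_sample": True, "intermediate_size": None}

# sort_keys=True orders keys alphabetically, as in the printed config.
# Python True/None are rendered as JSON true/null.
text = json.dumps(params, indent=4, sort_keys=True)
print("ToyConfig " + text)
```
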
classmethod from_dict(config_dict: Dict[str, Any], **kwargs)

Instantiates a PretrainedConfig from a Python dictionary of parameters.

Parameters

config_dict (Dict[str, Any]) – Dictionary that will be used to instantiate the configuration object. Such a dictionary can be retrieved from a pretrained checkpoint by leveraging the mindformers.models.PretrainedConfig.get_config_dict() method.

Returns

PretrainedConfig, the configuration object instantiated from those parameters.
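
The dictionary-to-object pattern described here can be sketched with a small stand-in class. ToyConfig is hypothetical and not part of mindformers. The sketch also assumes that extra keyword arguments override entries in the dictionary, which is not stated above.

```python
from typing import Any, Dict


class ToyConfig:
    """Hypothetical stand-in for a config class; not part of mindformers."""

    def __init__(self, **kwargs: Any) -> None:
        # Defaults chosen for illustration only.
        self.num_layers = kwargs.pop("num_layers", 2)
        self.seq_length = kwargs.pop("seq_length", 1024)
        # Any remaining keyword arguments are stored as plain attributes.
        for key, value in kwargs.items():
            setattr(self, key, value)

    @classmethod
    def from_dict(cls, config_dict: Dict[str, Any], **kwargs: Any) -> "ToyConfig":
        # Assumption: keyword arguments take precedence over dictionary entries.
        merged = {**config_dict, **kwargs}
        return cls(**merged)


config = ToyConfig.from_dict({"num_layers": 4, "hidden_size": 512}, seq_length=2048)
print(config.num_layers, config.seq_length, config.hidden_size)
```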

classmethod from_json_file(json_file: Union[str, os.PathLike])

Instantiates a PretrainedConfig from the path to a JSON file of parameters.

Parameters

json_file (Union[str, os.PathLike]) – Path to the JSON file containing the parameters.

Returns

PretrainedConfig, the configuration object instantiated from that JSON file.
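
As a rough illustration of the file-reading step, the sketch below writes a small JSON file of parameters to a temporary directory and reads it back into a dictionary. load_json_params is a hypothetical helper and the parameter values are made up; how the library turns the resulting dictionary into a config object is not shown here.

```python
import json
import os
import tempfile
from typing import Any, Dict, Union


def load_json_params(json_file: Union[str, os.PathLike]) -> Dict[str, Any]:
    """Hypothetical helper: read a JSON file of parameters into a dict."""
    with open(json_file, "r", encoding="utf-8") as reader:
        return json.load(reader)


with tempfile.TemporaryDirectory() as tmp_dir:
    json_path = os.path.join(tmp_dir, "config.json")
    # Write an illustrative parameter file first so the example is self-contained.
    with open(json_path, "w", encoding="utf-8") as writer:
        json.dump({"num_layers": 2, "seq_length": 1024}, writer)
    params = load_json_params(json_path)

print(params)
```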

classmethod from_pretrained(yaml_name_or_path, **kwargs)

Instantiates a config from a supported model name or from the path to a YAML config file.

Parameters
  • yaml_name_or_path (str) – A supported model name or a path to a model config file (.yaml). Supported model names can be listed with mindformers.AutoConfig.show_support_list(). A model name may be given with or without the mindspore prefix, for example "mindspore/vit_base_p16" or "vit_base_p16".

  • pretrained_model_name_or_path (str, optional) – Equivalent to yaml_name_or_path. If pretrained_model_name_or_path is set, yaml_name_or_path is ignored. Default: None.

Returns

A model config instance whose class inherits from PretrainedConfig.
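
The accepted forms of yaml_name_or_path can be illustrated with a small classifier. This is a sketch only: classify_yaml_name_or_path is a hypothetical helper, the YAML path is made up, and treating the prefixed and bare names as the same model is an assumption drawn from the parameter description, not the library's actual resolution logic.

```python
from typing import Tuple


def classify_yaml_name_or_path(value: str) -> Tuple[str, str]:
    """Hypothetical helper: label the argument as a YAML path or a model name."""
    if value.endswith(".yaml"):
        return "path", value
    prefix = "mindspore/"
    # Assumption: the prefixed and bare forms refer to the same model.
    name = value[len(prefix):] if value.startswith(prefix) else value
    return "name", name


print(classify_yaml_name_or_path("mindspore/vit_base_p16"))
print(classify_yaml_name_or_path("vit_base_p16"))
# Made-up path, used only to exercise the ".yaml" branch.
print(classify_yaml_name_or_path("configs/vit/run_vit.yaml"))
```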

classmethod get_config_dict(pretrained_model_name_or_path: Union[str, os.PathLike], **kwargs)

Resolves pretrained_model_name_or_path to a dictionary of parameters that can be used to instantiate a PretrainedConfig with mindformers.models.PretrainedConfig.from_dict().

Parameters

pretrained_model_name_or_path (Union[str, os.PathLike]) – The identifier of the pre-trained checkpoint from which we want the dictionary of parameters.

Returns

Tuple[dict, dict], the dictionary(ies) that will be used to instantiate the configuration object.

save_pretrained(save_directory=None, save_name='mindspore_model', **kwargs)

Saves the pre-trained configuration to the specified directory.

Parameters
  • save_directory (str, optional) – The directory in which to save the config YAML file. Default: None.

  • save_name (str, optional) – The name used for the saved files. Default: "mindspore_model".

to_dict()

Serializes this instance to a Python dictionary.

Returns

dict[str, Any], dictionary of all the attributes that make up this configuration instance.

to_diff_dict()

Serializes this instance to a Python dictionary, omitting attributes whose values match the default config attributes, for better readability.

Returns

dict[str, Any], dictionary of the attributes of this configuration instance that differ from the defaults.
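
The idea of dropping default-valued attributes can be sketched on plain dictionaries. diff_against_defaults is a hypothetical helper and the default values are made up. The sketch assumes that attributes missing from the defaults are kept, which the description above does not specify.

```python
from typing import Any, Dict


def diff_against_defaults(current: Dict[str, Any], defaults: Dict[str, Any]) -> Dict[str, Any]:
    """Keep only entries that are absent from, or differ from, the defaults."""
    return {
        key: value
        for key, value in current.items()
        if key not in defaults or defaults[key] != value
    }


# Made-up defaults and current values, for illustration only.
defaults = {"num_layers": 32, "seq_length": 2048, "use_past": False}
current = {"num_layers": 2, "seq_length": 2048, "use_past": False, "vocab_size": 32000}

diff = diff_against_defaults(current, defaults)
print(diff)
```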

to_json_file(json_file_path: Union[str, os.PathLike], use_diff: bool = True)

Saves this instance to a JSON file.

Parameters
  • json_file_path (Union[str, os.PathLike]) – Path to the JSON file in which this configuration instance's parameters will be saved.

  • use_diff (bool, optional) – If set to True, only the difference between the config instance and the default mindformers.models.PretrainedConfig is serialized to JSON file. Default: True.

to_json_string(use_diff: bool = True)

Serializes this instance to a JSON string.

Parameters

use_diff (bool, optional) – If set to True, only the difference between the config instance and the default PretrainedConfig() is serialized to JSON string. Default: True.

Returns

str, string containing all the attributes that make up this configuration instance in JSON format.