mindformers.models.PretrainedConfig
- class mindformers.models.PretrainedConfig(**kwargs)[source]
Base class for all configuration classes. Handles a few parameters common to all models' configurations as well as methods for loading/downloading/saving configurations.
Note
A configuration file can be loaded and saved to disk. Loading the configuration file and using this file to initialize a model does not load the model weights. It only affects the model's configuration.
- Parameters
**kwargs (Any) –
Keyword arguments.
name_or_path (str, optional): Stores the string that was passed to mindformers.models.PreTrainedModel.from_pretrained() as pretrained_model_name_or_path if the configuration was created with that method. Default: "".
checkpoint_name_or_path (str, optional): The path or name of the checkpoint file. Default: None.
mindformers_version (str, optional): The version of MindSpore Transformers. Default: None.
- Returns
PretrainedConfig, a PretrainedConfig instance.
Examples
>>> from mindformers.models import LlamaConfig
>>> config = LlamaConfig(num_layers=2, seq_length=1024)
>>> print(config)
LlamaConfig {
  "batch_size": 1,
  "block_size": 16,
  "bos_token_id": 1,
  "checkpoint_name_or_path": "",
  "compute_dtype": "float16",
  "do_sample": true,
  "embedding_init_type": "float16",
  "eos_token_id": 2,
  "extend_method": "None",
  "ffn_dim_multiplier": null,
  "fine_grain_interleave": 1,
  "hidden_size": 4096,
  "ignore_token_id": -100,
  "intermediate_size": null,
  "is_dynamic": false,
  "layernorm_compute_type": "float32",
  "llm_backend": "",
  "max_decode_length": 1024,
  "max_position_embedding": 1024,
  "mindformers_version": "dev",
  "model_type": "llama",
  "multiple_of": 256,
  "n_kv_heads": null,
  "num_blocks": 512,
  "num_heads": 32,
  "num_layers": 2,
  "offset": 0,
  "pad_token_id": 0,
  "parallel_decoding_params": null,
  "parallel_optimizer": false,
  "param_init_type": "float16",
  "pp_interleave_num": 1,
  "qkv_concat": false,
  "qkv_has_bias": false,
  "quant_config": null,
  "repetition_penalty": 1.0,
  "rms_norm_eps": 1e-05,
  "rotary_dtype": "float32",
  "scaling_factor": 1.0,
  "seq_length": 1024,
  "softmax_compute_type": "float32",
  "theta": 10000.0,
  "tie_word_embeddings": false,
  "top_k": 5,
  "top_p": 1.0,
  "use_attn_mask_compression": false,
  "use_flash_attention": false,
  "use_past": false,
  "use_ring_attention": false,
  "use_rope_slice": false,
  "vocab_size": 32000
}
- classmethod from_dict(config_dict: Dict[str, Any], **kwargs)[source]
Instantiates a PretrainedConfig from a Python dictionary of parameters.
- Parameters
config_dict (Dict[str, Any]) – Dictionary that will be used to instantiate the configuration object. Such a dictionary can be retrieved from a pretrained checkpoint with the mindformers.models.PretrainedConfig.get_config_dict() method.
- Returns
PretrainedConfig, the configuration object instantiated from those parameters.
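The dictionary round trip described above can be sketched with a small stand-in class. ToyConfig and its two fields are hypothetical and exist only for illustration; this is not the MindSpore Transformers implementation. The rule that keyword arguments take precedence over dictionary entries is an assumption of this sketch.

```python
from typing import Any, Dict


class ToyConfig:
    """Hypothetical stand-in for a configuration class (not the library's own)."""

    def __init__(self, **kwargs: Any) -> None:
        # Two illustrative fields with defaults; any other key is stored as-is.
        self.num_layers = kwargs.pop("num_layers", 12)
        self.seq_length = kwargs.pop("seq_length", 2048)
        for key, value in kwargs.items():
            setattr(self, key, value)

    @classmethod
    def from_dict(cls, config_dict: Dict[str, Any], **kwargs: Any) -> "ToyConfig":
        # Assumption of this sketch: keyword arguments override dictionary entries.
        merged = {**config_dict, **kwargs}
        return cls(**merged)

    def to_dict(self) -> Dict[str, Any]:
        # Copy the instance attributes into a plain dictionary.
        return dict(vars(self))


config = ToyConfig.from_dict({"num_layers": 2, "seq_length": 1024}, seq_length=512)
restored = ToyConfig.from_dict(config.to_dict())
```

Here restored carries the same attributes as config, which is the property that makes a dictionary a convenient interchange format between a checkpoint and a configuration object.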
- classmethod from_json_file(json_file: Union[str, os.PathLike])[source]
Instantiates a PretrainedConfig from the path to a JSON file of parameters.
- Parameters
json_file (Union[str, os.PathLike]) – Path to the JSON file containing the parameters.
- Returns
PretrainedConfig, the configuration object instantiated from that JSON file.
- classmethod from_pretrained(yaml_name_or_path, **kwargs)[source]
Instantiates a configuration from a YAML name or path.
- Parameters
yaml_name_or_path (str) – A supported model name or a path to a model config file (.yaml). Supported model names can be listed with mindformers.AutoConfig.show_support_list(). A model name may be given with or without the mindspore prefix, for example "mindspore/vit_base_p16" or "vit_base_p16".
**kwargs (Any) –
Keyword arguments.
pretrained_model_name_or_path (str, optional): Equivalent to yaml_name_or_path. If pretrained_model_name_or_path is set, yaml_name_or_path is ignored. Default: None.
- Returns
A model config that inherits from PretrainedConfig.
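The argument handling described for from_pretrained can be pictured roughly as follows. This is a sketch only: resolve_config_source is a hypothetical helper, not part of MindSpore Transformers, and the .yaml suffix check is an assumed way of telling a path from a model name.

```python
from typing import Optional, Tuple


def resolve_config_source(
    yaml_name_or_path: Optional[str],
    pretrained_model_name_or_path: Optional[str] = None,
) -> Tuple[str, str]:
    """Hypothetical helper: decide whether the input is a model name or a YAML path."""
    # pretrained_model_name_or_path, when given, takes the place of yaml_name_or_path.
    source = (
        pretrained_model_name_or_path
        if pretrained_model_name_or_path is not None
        else yaml_name_or_path
    )
    if source is None:
        raise ValueError("A model name or a path to a .yaml file is required.")
    # Assumption: anything ending in .yaml is treated as a file path.
    if source.endswith(".yaml"):
        return ("path", source)
    # A model name is accepted with or without the "mindspore/" prefix.
    prefix = "mindspore/"
    name = source[len(prefix):] if source.startswith(prefix) else source
    return ("name", name)


with_prefix = resolve_config_source("mindspore/vit_base_p16")
without_prefix = resolve_config_source("vit_base_p16")
override = resolve_config_source("vit_base_p16", pretrained_model_name_or_path="configs/run.yaml")
```

Both spellings of the model name resolve to the same entry, and the override case shows yaml_name_or_path being ignored once pretrained_model_name_or_path is supplied. The path configs/run.yaml is a made-up example value.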
- classmethod get_config_dict(pretrained_model_name_or_path: Union[str, os.PathLike], **kwargs)[source]
Resolves pretrained_model_name_or_path to a dictionary of parameters, to be used for instantiating a PretrainedConfig with mindformers.models.PretrainedConfig.from_dict().
- Parameters
pretrained_model_name_or_path (Union[str, os.PathLike]) – The identifier of the pre-trained checkpoint from which we want the dictionary of parameters.
- Returns
Tuple[dict, dict], the dictionary(ies) that will be used to instantiate the configuration object.
- save_pretrained(save_directory=None, save_name='mindspore_model', **kwargs)[source]
Saves the pre-trained configuration to the specified directory.
- to_dict()[source]
Serializes this instance to a Python dictionary.
- Returns
dict[str, Any], dictionary of all the attributes that make up this configuration instance.
- to_diff_dict()[source]
Serializes this instance to a Python dictionary, omitting every attribute whose value matches the default configuration, for better readability.
- Returns
dict[str, Any], dictionary of all the attributes that make up this configuration instance.
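The difference between to_dict() and to_diff_dict() can be illustrated with a minimal stand-in. ToyConfig and its fields are hypothetical; the sketch assumes that "default" means the values produced by constructing the class with no arguments, and it is not the library's implementation.

```python
from typing import Any, Dict


class ToyConfig:
    """Hypothetical stand-in used only to illustrate diff serialization."""

    def __init__(self, num_layers: int = 12, seq_length: int = 2048, use_past: bool = False) -> None:
        self.num_layers = num_layers
        self.seq_length = seq_length
        self.use_past = use_past

    def to_dict(self) -> Dict[str, Any]:
        # Every attribute, whether or not it differs from the default.
        return dict(vars(self))

    def to_diff_dict(self) -> Dict[str, Any]:
        # Keep only the attributes that differ from a default-constructed instance.
        defaults = ToyConfig().to_dict()
        return {
            key: value
            for key, value in self.to_dict().items()
            if key not in defaults or value != defaults[key]
        }


config = ToyConfig(num_layers=2)
full = config.to_dict()
diff = config.to_diff_dict()
```

In this sketch full holds all three fields, while diff holds only num_layers, the one value that was changed.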
- to_json_file(json_file_path: Union[str, os.PathLike], use_diff: bool = True)[source]
Save this instance to a JSON file.
- Parameters
json_file_path (Union[str, os.PathLike]) – Path to the JSON file in which this configuration instance's parameters will be saved.
use_diff (bool, optional) – If set to True, only the difference between the config instance and the default mindformers.models.PretrainedConfig is serialized to the JSON file. Default: True.
- to_json_string(use_diff: bool = True)[source]
Serializes this instance to a JSON string.
- Parameters
use_diff (bool, optional) – If set to True, only the difference between the config instance and the default PretrainedConfig() is serialized to the JSON string. Default: True.
- Returns
str, string containing all the attributes that make up this configuration instance in JSON format.
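The effect of use_diff on JSON output, and a save-then-load round trip, can be sketched with the standard library alone. DEFAULTS and the two module-level helpers below are hypothetical stand-ins for the methods of the same name, not the library's code. Filling omitted keys from the defaults on load is an assumption of this sketch.

```python
import json
import os
import tempfile
from typing import Any, Dict

# Hypothetical default values for a two-field configuration.
DEFAULTS: Dict[str, Any] = {"num_layers": 12, "seq_length": 2048}


def to_json_string(values: Dict[str, Any], use_diff: bool = True) -> str:
    """Serialize values to JSON, optionally dropping entries equal to the defaults."""
    if use_diff:
        data = {k: v for k, v in values.items() if k not in DEFAULTS or v != DEFAULTS[k]}
    else:
        data = dict(values)
    return json.dumps(data, indent=2, sort_keys=True) + "\n"


def from_json_file(json_file: str) -> Dict[str, Any]:
    """Load a JSON file and fill in any omitted keys from the defaults."""
    with open(json_file, "r", encoding="utf-8") as handle:
        loaded = json.load(handle)
    return {**DEFAULTS, **loaded}


values = {"num_layers": 2, "seq_length": 2048}
diff_text = to_json_string(values, use_diff=True)
full_text = to_json_string(values, use_diff=False)

# Write the compact form to a temporary file and read it back.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "config.json")
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(diff_text)
    restored = from_json_file(path)
```

The compact form stays readable because it lists only what was changed, and in this sketch no information is lost because the defaults are reapplied when the file is read.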