mindformers.models.PreTrainedModel

class mindformers.models.PreTrainedModel(config: PretrainedConfig, *inputs, **kwargs)

Base class for all models. It stores the model configuration and handles loading, downloading and saving models, as well as a few methods common to all models, such as resizing the input embeddings and pruning self-attention heads.

Parameters
  • config (PretrainedConfig) – Configuration class for this model architecture.

  • inputs (tuple, optional) – Variable-length positional arguments, reserved for future extension.

  • kwargs (dict, optional) – Variable-length keyword arguments, reserved for future extension.

Examples

>>> from mindformers import AutoModel
>>> import mindspore as ms
>>> ms.set_context(mode=0)
>>> network = AutoModel.from_pretrained('llama2_7b')
>>> type(network)
<class 'mindformers.models.llama.llama.LlamaForCausalLM'>

classmethod can_generate()

Returns whether this model can generate sequences with .generate().

Returns

bool, whether this model can generate sequences with .generate().

classmethod from_pretrained(pretrained_model_name_or_dir: str, *model_args, **kwargs)

Instantiates a model from pretrained_model_name_or_dir. It downloads the model weights if the user passes a model name, or loads the weights from the given directory if a path is given. (Only standalone mode is supported; distributed mode is still under development.)

Parameters
  • pretrained_model_name_or_dir (str) – Either a supported model name or a path to a local directory. If it is a supported model name, for example vit_base_p16 or t5_small, the necessary files are downloaded from the cloud; the list of supported names can be obtained by calling MindFormerBook.get_model_support_list(). If it is a path to a local directory, that directory should contain the model weights in a file ending with .ckpt and a configuration file ending with .yaml.

  • model_args (str, optional) – Model extension parameters. If "pretrained_model_name_or_path" is included, it is treated as equivalent to "pretrained_model_name_or_dir"; when "pretrained_model_name_or_path" is set, "pretrained_model_name_or_dir" is ignored.

  • kwargs (dict, optional) – Variable-length keyword arguments, reserved for future extension.

Returns

A model instance that inherits from PreTrainedModel.
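
The two input types accepted by pretrained_model_name_or_dir can be told apart before calling from_pretrained. The sketch below is a hypothetical helper, not part of MindFormers: classify_pretrained_source and its return values are assumptions made for illustration, and it only mirrors the rule stated above that a local directory should hold a .ckpt weight file and a .yaml configuration file. It does not check whether a name is actually in the supported list.

```python
import os


def classify_pretrained_source(name_or_dir):
    """Guess which of the two input types a string represents.

    Hypothetical helper, not part of MindFormers.
    Returns "local_dir" for an existing directory that contains at least
    one .ckpt file and one .yaml file, and "model_name" for anything that
    is not an existing directory. Raises FileNotFoundError for a directory
    that lacks either kind of file.
    """
    if os.path.isdir(name_or_dir):
        files = os.listdir(name_or_dir)
        has_ckpt = any(f.endswith(".ckpt") for f in files)
        has_yaml = any(f.endswith(".yaml") for f in files)
        if has_ckpt and has_yaml:
            return "local_dir"
        raise FileNotFoundError(
            f"{name_or_dir} is a directory but lacks a .ckpt or .yaml file"
        )
    # Not a directory on disk: treat it as a model name to be downloaded.
    return "model_name"
```

A check like this can surface an incomplete checkpoint directory early, before any network or weight-loading work starts.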

post_init()

A method executed at the end of each Transformer model initialization, to execute code that needs the model's modules properly initialized (such as weight initialization).

classmethod register_for_auto_class(auto_class='AutoModel')

Register this class with a given auto class. This should only be used for custom models as the ones in the library are already mapped with an auto class.

Warning

This API is experimental and may undergo slight breaking changes in future releases.

Parameters
  • auto_class (Union[str, type], optional) – The auto class to register this new model with. Default: AutoModel.

save_pretrained(save_directory: Union[str, os.PathLike], save_name: str = 'mindspore_model', **kwargs)

Save the model weights and configuration file. (Only standalone mode is supported; distributed mode is still under development.)

Parameters
  • save_directory (Union[str, os.PathLike]) – The directory in which to save the model weights and configuration.

  • save_name (str) – The name used for the saved files, both the model weight file and the configuration file. Default: mindspore_model.

  • kwargs (dict, optional) – Variable-length keyword arguments, reserved for future extension.

Examples

>>> import os
>>> from mindformers import AutoModel
>>> import mindspore as ms
>>> ms.set_context(mode=0)
>>> net = AutoModel.from_pretrained('llama2_7b')
>>> net.save_pretrained('./checkpoint_save')
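
After a call such as the one above, it can be useful to confirm that both files were written. The sketch below is a hypothetical pair of helpers, not part of MindFormers. It assumes the weight file is named "<save_name>.ckpt" and the configuration file "<save_name>.yaml", following the extensions that from_pretrained expects in a local directory; the actual file names may differ between versions.

```python
import os


def expected_saved_files(save_directory, save_name="mindspore_model"):
    """Return the weight and config paths one would expect after saving.

    Assumption: files are named "<save_name>.ckpt" and "<save_name>.yaml".
    """
    return {
        "weights": os.path.join(save_directory, save_name + ".ckpt"),
        "config": os.path.join(save_directory, save_name + ".yaml"),
    }


def missing_saved_files(save_directory, save_name="mindspore_model"):
    """List the expected files that are not present on disk."""
    paths = expected_saved_files(save_directory, save_name)
    return [p for p in paths.values() if not os.path.isfile(p)]
```

For the example above, missing_saved_files('./checkpoint_save') would return an empty list if both files exist under the assumed names.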