Modelers Contribution Guidelines


Upload a Model to the Modelers Community

Modelers Community is a model hosting platform where users can upload custom models for hosting.

MindFormers Built-in Models

If the custom model uses a built-in model provided by MindFormers, i.e. a model whose code is located under mindformers/models, and the model's structure code has not been modified, you only need to upload the weight file and the configuration.

For example, if a user fine-tunes the MindFormers built-in ChatGLM2 model and wants to share the fine-tuned weights, uploading the model configuration and weights file is sufficient.

Below is sample code that saves the model configuration and weights:

import mindspore as ms
from mindformers import ChatGLM2Config, ChatGLM2ForConditionalGeneration

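# Build the model from its configuration
config = ChatGLM2Config()
model = ChatGLM2ForConditionalGeneration(config)
ms.load_checkpoint("path/model.ckpt", model)  # Load custom weights into the model

# Save the configuration and weights into ./my_model
model.save_pretrained("./my_model", save_json=True)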

Running the above code saves the config.json file and the mindspore_model.ckpt file (large weights are automatically split when saved).

After saving, you can use the openmind_hub library for model uploading. See Model Upload.

import openmind_hub

openmind_hub.upload_folder(
    folder_path="/path/to/local/folder",
    repo_id="username/your-model-name",
    token="your-token",
)

An uploaded example can be found in the OpenLlama model of the Modelers community.

Custom Models

If the model code has been customized, you need to upload the model code files as well and add a mapping to the JSON configuration file so that the model can be imported through the Auto class.

Naming Rules

Custom code files uploaded to the community generally follow a uniform naming convention. Assuming the custom model is named model, its code files should be named as follows:

---- model
    |- configuration_model.py  # Config class code files
    |- modeling_model.py       # Model class code files
    |- tokenization_model.py   # Tokenizer code files
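The naming rule can be checked before uploading. The following is a minimal sketch, not part of MindFormers or openmind_hub: expected_code_files and missing_code_files are hypothetical helper names, and the sketch assumes the code files sit directly in the folder to be uploaded.

```python
from pathlib import Path
from typing import List


def expected_code_files(model_name: str) -> List[str]:
    """Return the code file names that the naming rule implies for model_name."""
    return [
        f"configuration_{model_name}.py",  # Config class code
        f"modeling_{model_name}.py",       # Model class code
        f"tokenization_{model_name}.py",   # Tokenizer code
    ]


def missing_code_files(folder: str, model_name: str) -> List[str]:
    """List the expected code files that are not present in folder."""
    root = Path(folder)
    return [name for name in expected_code_files(model_name)
            if not (root / name).is_file()]
```

For example, missing_code_files("./my_model", "model") returns the names that still need to be added. A missing tokenization file may be acceptable when the model has no custom tokenizer.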

Adding auto Mapping

So that the Auto class can find the user-defined model class, add an auto_map entry to the config.json file, as follows:

{
  "auto_map": {
    "AutoConfig": "configuration_model.MyConfig",
    "AutoModel": "modeling_model.MyModel",
    "AutoModelForCausalLM": "modeling_model.MyModelForCausalLM"
  }
}
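The mapping can also be written into an existing config.json with a short script instead of editing the file by hand. The following is a minimal sketch using only the Python standard library; add_auto_map is a hypothetical helper, not a MindFormers API, and the demo file with its "model_type" key is a placeholder for illustration.

```python
import json
import tempfile
from pathlib import Path


def add_auto_map(json_path, auto_map):
    """Merge auto_map entries into an existing JSON config file and rewrite it."""
    path = Path(json_path)
    config = json.loads(path.read_text(encoding="utf-8"))
    # Keep any mapping already present; same-named keys are overwritten.
    config.setdefault("auto_map", {}).update(auto_map)
    # json.dumps never emits trailing commas, so the file stays valid JSON.
    path.write_text(json.dumps(config, indent=2, ensure_ascii=False),
                    encoding="utf-8")
    return config


# Demo on a throwaway file; "model_type" is a placeholder key.
with tempfile.TemporaryDirectory() as tmp:
    demo = Path(tmp) / "config.json"
    demo.write_text('{"model_type": "my_model"}', encoding="utf-8")
    updated = add_auto_map(demo, {
        "AutoConfig": "configuration_model.MyConfig",
        "AutoModel": "modeling_model.MyModel",
        "AutoModelForCausalLM": "modeling_model.MyModelForCausalLM",
    })
    on_disk = json.loads(demo.read_text(encoding="utf-8"))
```

In practice the call would point at the saved file, for example add_auto_map("./my_model/config.json", {...}). Existing keys in the file are preserved, and only the auto_map entry is added or extended.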

If there is a custom tokenizer, the tokenizer needs to be saved:

tokenizer.save_pretrained("./my_model", save_json=True)

Then add the auto mapping to the saved tokenizer_config.json:

{
  "auto_map": {
    "AutoTokenizer": ["tokenization_model.MyTokenizer", "tokenization_model.MyFastTokenizer"]
  }
}
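Before uploading, it can help to confirm that every auto_map entry points at a code file that is actually in the folder. The sketch below assumes each entry has the form "module_name.ClassName" and that module_name corresponds to module_name.py in the same folder, as the examples above suggest; referenced_modules and missing_module_files are hypothetical helpers, not part of any library.

```python
from pathlib import Path
from typing import Dict, List, Union

AutoMapValue = Union[str, List[str]]


def referenced_modules(auto_map: Dict[str, AutoMapValue]) -> List[str]:
    """Collect the module names (text before the last dot) used in auto_map."""
    modules = set()
    for value in auto_map.values():
        # An entry is either one "module.Class" string or a list of them.
        refs = [value] if isinstance(value, str) else value
        for ref in refs:
            if not ref:
                continue  # tolerate empty slots in a list
            module, _, _class_name = ref.rpartition(".")
            if module:
                modules.add(module)
    return sorted(modules)


def missing_module_files(folder: str, auto_map: Dict[str, AutoMapValue]) -> List[str]:
    """Return the referenced .py files that do not exist in folder."""
    root = Path(folder)
    return [f"{name}.py" for name in referenced_modules(auto_map)
            if not (root / f"{name}.py").is_file()]
```

An empty list from missing_module_files("./my_model", auto_map) indicates that every referenced code file is present. The check does not verify that the named classes exist inside those files.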

Uploading the Model

Model uploading can be done using the openmind_hub library. See Model Upload.

import openmind_hub

openmind_hub.upload_folder(
    folder_path="/path/to/local/folder",
    repo_id="username/your-model-name",
    token="your-token",
)

The uploaded example can be found in the Model of the Modelers community.