Modelers Contribution Guidelines

Upload a Model to the Modelers Community

Modelers Community is a model hosting platform to which users can upload custom models for hosting.

MindFormers Built-in Models

If the custom model uses a built-in model provided by MindFormers, i.e. a model whose code is located under mindformers/models, and the model's structure code has not been modified, you only need to upload the weight file and the configuration.

For example, if you fine-tune the ChatGLM2 model built into MindFormers and want to share the fine-tuned weights, uploading the model configuration and the weight file is sufficient.

Below is sample code that saves the model configuration and weights:

import mindspore as ms
from mindformers import ChatGLM2Config, ChatGLM2ForConditionalGeneration

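config = ChatGLM2Config()  # Default configuration; adjust it to match the fine-tuned model
model = ChatGLM2ForConditionalGeneration(config)
ms.load_checkpoint("path/model.ckpt", model)  # Load custom weights into the model

# Write config.json and the weights into ./my_model
model.save_pretrained("./my_model", save_json=True)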

Running the code above saves the config.json file and the mindspore_model.ckpt file (larger weights are automatically split when saved).

After saving, you can upload the model with the openmind_hub library. See Model Upload.

import openmind_hub

openmind_hub.upload_folder(
    folder_path="/path/to/local/folder",
    repo_id="username/your-model-name",
    token="your-token",
)

An uploaded example can be found in the OpenLlama model of the Modelers community.

Custom Models

If you have customized the model code, you also need to upload the model code files and add a mapping to the JSON configuration file so that the model can be imported through the Auto classes.

Naming Rules

Custom code files uploaded to the community generally follow a uniform naming convention. Assuming the custom model is named model, its code files should be named as follows:

---- model
    |- configuration_model.py  # Config class code files
    |- modeling_model.py       # Model class code files
    |- tokenization_model.py   # Tokenizer code files
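The naming convention above can be sketched as a small pre-upload check. This is only an illustration using the standard library; the helper names expected_code_files and missing_code_files are hypothetical and are not part of MindFormers or openmind_hub.

```python
from pathlib import Path


def expected_code_files(model_name: str) -> list[str]:
    """Return the code file names implied by the naming rules for `model_name`."""
    return [
        f"configuration_{model_name}.py",  # Config class code
        f"modeling_{model_name}.py",       # Model class code
        f"tokenization_{model_name}.py",   # Tokenizer code
    ]


def missing_code_files(folder: str, model_name: str) -> list[str]:
    """List the expected code files that are not present in `folder`."""
    root = Path(folder)
    return [
        name for name in expected_code_files(model_name)
        if not (root / name).is_file()
    ]
```

For a model named model stored in ./my_model, missing_code_files("./my_model", "model") returns the names that still need to be added. A missing tokenization file is only a problem if the model ships a custom tokenizer.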

Adding auto Mapping

For the Auto classes to find the user-defined model classes, you need to add an auto mapping to the config.json file. Add the following content:

{
  "auto_map": {
    "AutoConfig": "configuration_model.MyConfig",
    "AutoModel": "modeling_model.MyModel",
    "AutoModelForCausalLM": "modeling_model.MyModelForCausalLM"
  }
}
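Instead of editing the file by hand, the mapping can be merged into the saved config.json programmatically. The following is a minimal sketch using only the standard json module; add_auto_map is a hypothetical helper name, and the class names are the placeholder names from the mapping above.

```python
import json
from pathlib import Path


def add_auto_map(json_path: str, auto_map: dict) -> dict:
    """Merge `auto_map` entries into the JSON file at `json_path` and rewrite it."""
    path = Path(json_path)
    config = json.loads(path.read_text(encoding="utf-8"))
    # Keep any existing entries and overwrite only the keys passed in.
    config.setdefault("auto_map", {}).update(auto_map)
    path.write_text(
        json.dumps(config, indent=2, ensure_ascii=False), encoding="utf-8"
    )
    return config


# Example call (assumes ./my_model/config.json was saved beforehand):
# add_auto_map("./my_model/config.json", {
#     "AutoConfig": "configuration_model.MyConfig",
#     "AutoModel": "modeling_model.MyModel",
#     "AutoModelForCausalLM": "modeling_model.MyModelForCausalLM",
# })
```

Writing the file through json.dumps also guarantees that the result is valid JSON, for instance without trailing commas, which strict JSON parsers reject.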

If there is a custom tokenizer, it needs to be saved as well:

tokenizer.save_pretrained("./my_model", save_json=True)

Then add the auto mapping to the saved tokenizer_config.json:

{
  "auto_map": {
    "AutoTokenizer": ["tokenization_model.MyTokenizer", "tokenization_model.MyFastTokenizer"]
  }
}
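Before uploading, it can help to confirm that every auto_map entry points to a code file that actually exists in the model folder. The sketch below assumes each entry has the form "<module file name without .py>.<ClassName>", as in the mappings above, and that the code files sit next to the JSON file. Both helper names are hypothetical, and the check covers only file presence, not whether the class is defined.

```python
import json
from pathlib import Path


def iter_auto_map_refs(auto_map: dict):
    """Yield every "module.ClassName" string, flattening list values."""
    for value in auto_map.values():
        refs = value if isinstance(value, (list, tuple)) else [value]
        for ref in refs:
            if ref:  # Skip None or empty entries
                yield ref


def unresolved_auto_map_refs(folder: str, json_name: str) -> list[str]:
    """Return auto_map references whose module file is missing from `folder`."""
    root = Path(folder)
    data = json.loads((root / json_name).read_text(encoding="utf-8"))
    missing = []
    for ref in iter_auto_map_refs(data.get("auto_map", {})):
        module_name = ref.rsplit(".", 1)[0]  # Drop the class name
        if not (root / f"{module_name}.py").is_file():
            missing.append(ref)
    return missing
```

Calling unresolved_auto_map_refs("./my_model", "config.json") and unresolved_auto_map_refs("./my_model", "tokenizer_config.json") should both return an empty list once all code files are in place.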

Uploading the Model

Model uploading can be done using the openmind_hub library. See Model Upload.

import openmind_hub

openmind_hub.upload_folder(
    folder_path="/path/to/local/folder",
    repo_id="username/your-model-name",
    token="your-token",
)

An uploaded example can be found in the Model of the Modelers community.