mindformers.models.multi_modal.ModalContentTransformTemplate
- class mindformers.models.multi_modal.ModalContentTransformTemplate(output_columns: List[str] = None, tokenizer=None, mode='predict', vstack_columns: List[str] = None, modal_content_padding_size=1, max_length=2048, **kwargs)[source]
Base class for modal content transform templates. Each specific model should implement its own subclass. A subclass can override the methods build_conversation_input_text, update_result_before_output, batch, and post_process to meet the model's requirements.
- Parameters
output_columns (List[str], optional) – Specify which columns will be output. Default: None.
tokenizer (Tokenizer, optional) – A constructed tokenizer for the model. Default: None.
mode (str) – Running mode, either predict or train. Default: predict.
vstack_columns (List[str], optional) – Specify which columns will be vertically stacked (vstack) when batching data. Default: None.
modal_content_padding_size (int) – Used in training mode by Template subclasses. It usually represents the maximum number of modal contents (such as images) supported within a training sample. When a training sample contains fewer modal contents than this value, they are padded up to it. Default: 1.
max_length (int) – Used in training mode by Template subclasses. It usually represents the maximum length to which a training sample can be padded after tokenization and content masking. Default: 2048.
kwargs (dict, optional) – Additional keyword arguments reserved for future extension.
Examples
>>> from mindformers.models.multi_modal import ModalContentTransformTemplate
>>> ModalContentTransformTemplate().supported_modal
[]
>>> # Note:
>>> # The property of 'supported_modal' should be inherited by subclasses,
>>> # and subclasses implement the corresponding modal builders.
>>> # The current base class does not support any modal builders, so it returns '[]'.
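The effect of modal_content_padding_size described above can be illustrated with a minimal, library-independent sketch. The helper name pad_modal_contents, the None padding value, and the error raised on overflow are all assumptions made for illustration; they are not part of the mindformers API, and a real subclass may pad differently.

```python
from typing import Any, List


def pad_modal_contents(contents: List[Any], padding_size: int,
                       pad_value: Any = None) -> List[Any]:
    """Pad a list of modal contents (for example image paths) to padding_size.

    Hypothetical helper: a real Template subclass decides its own padding
    value and how to treat samples that exceed the limit.
    """
    if len(contents) > padding_size:
        # Assumption: exceeding the supported maximum is treated as an error.
        raise ValueError(
            f"sample has {len(contents)} modal contents, "
            f"but at most {padding_size} are supported"
        )
    return contents + [pad_value] * (padding_size - len(contents))


# One image in a sample, with room for up to three modal contents.
padded = pad_modal_contents(["/path/to/image"], padding_size=3)
```

After this call, padded holds the original path followed by two padding entries, so every training sample exposes the same number of modal slots.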
- batch(data_list, token_padding_length, **kwargs)[source]
Batch the data of the columns listed in output_columns.
- Parameters
- Returns
A dict. Used to store the batched data.
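As a rough illustration of what batching per column might look like, the sketch below collects each output column across samples and pads token sequences to a common length. It assumes that token_padding_length is the length every token sequence is padded to, which the signature suggests but this page does not state. The function name batch_columns, the column name input_ids, and the padding ID 0 are hypothetical.

```python
from typing import Any, Dict, List


def batch_columns(data_list: List[Dict[str, Any]], output_columns: List[str],
                  token_padding_length: int, pad_id: int = 0) -> Dict[str, list]:
    """Group per-sample values by column and pad the token column.

    Hypothetical sketch: the real batch method may also stack arrays
    (see vstack_columns) and handle further column types.
    """
    batched: Dict[str, list] = {column: [] for column in output_columns}
    for item in data_list:
        for column in output_columns:
            value = item[column]
            if column == "input_ids":
                # Assumption: token sequences are right-padded with pad_id.
                value = value + [pad_id] * (token_padding_length - len(value))
            batched[column].append(value)
    return batched


samples = [{"input_ids": [1, 2, 3]}, {"input_ids": [4]}]
batch = batch_columns(samples, ["input_ids"], token_padding_length=4)
```

The returned dict maps each column name to a list with one entry per sample, which matches the "dict used to store the batched data" described above.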
- abstract build_conversation_input_text(raw_inputs, result_recorder: DataRecord)[source]
Used in predict mode to assemble a conversation from the incoming inputs. Usually overridden by subclasses.
- Parameters
raw_inputs (str) – input data.
result_recorder (DataRecord) – The result data recorder is used to save data that needs to be recorded during the inference process. Values are stored by calling the put method of the DataRecord.
- Returns
A str. The assembled conversation.
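A subclass override might look roughly like the sketch below. To keep it runnable without mindformers, it uses a stand-in recorder class (SketchRecord) that only models a put(key, value) store; the real DataRecord signature, the key name raw_prompt, and the "user: … assistant:" prompt format are all assumptions.

```python
class SketchRecord:
    """Stand-in for DataRecord that only models storing values with put."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        # Assumption: put takes a key and a value.
        self._data[key] = value

    def get(self, key, default=None):
        return self._data.get(key, default)


def build_conversation_input_text(raw_inputs: str,
                                  result_recorder: SketchRecord) -> str:
    """Wrap the raw input in a hypothetical chat prompt format."""
    # Keep the untouched input so later steps can read it back.
    result_recorder.put("raw_prompt", raw_inputs)
    return f"user: {raw_inputs}\nassistant:"


recorder = SketchRecord()
text = build_conversation_input_text("describe image in English", recorder)
```

The method returns the assembled string, while anything that later stages need is stored on the recorder rather than returned.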
- build_labels(text_id_list, result_recorder, **kwargs)[source]
Used in training mode. Subclasses override this method to construct the labels needed for training from text data.
- Parameters
text_id_list (list) – A list containing text data identifiers or indices.
result_recorder (DataRecord) – The result data recorder is used to save data that needs to be recorded during the inference process. Values are stored by calling the put method of the DataRecord.
kwargs (dict, optional) – Additional keyword arguments reserved for future extension.
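This page does not specify how labels are built, so the following is only a sketch of one common convention for language-model training: positions that should not contribute to the loss are replaced with an ignore index. The function name, the (token_ids, is_target) segment layout, and the value -100 are assumptions, not the library's behavior.

```python
from typing import List, Tuple

IGNORE_INDEX = -100  # Assumed ignore value; a subclass may use another.


def build_labels_sketch(segments: List[Tuple[List[int], bool]]) -> List[int]:
    """Build labels from (token_ids, is_target) segments.

    Tokens of target segments (for example assistant replies) are kept,
    all other tokens are masked with IGNORE_INDEX.
    """
    labels: List[int] = []
    for token_ids, is_target in segments:
        if is_target:
            labels.extend(token_ids)
        else:
            labels.extend([IGNORE_INDEX] * len(token_ids))
    return labels


# A three-token prompt followed by a two-token reply.
labels = build_labels_sketch([([11, 12, 13], False), ([21, 22], True)])
```

The resulting label list has the same length as the concatenated token IDs, which keeps inputs and labels aligned position by position.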
- build_modal_context(input_ids, result_recorder: DataRecord, **kwargs)[source]
According to the requirements of the modal builder, process the input_ids and finally return the processed input_ids.
- Parameters
input_ids (list) – input data.
result_recorder (DataRecord) – The result data recorder is used to save data that needs to be recorded during the inference process. Values are stored by calling the put method of the DataRecord.
kwargs (dict, optional) – Additional keyword arguments reserved for future extension.
- Returns
The processed input_ids.
- get_need_update_output_items(result: DataRecord)[source]
Retrieve the output items that need to be updated.
- Parameters
result (DataRecord) – The result data recorder is used to save data that needs to be recorded during the inference process. Values are stored by calling the put method of the DataRecord.
- Returns
A Dict. Defaults to an empty dict.
- process_predict_query(query_ele_list: List[Dict], result_recorder: DataRecord)[source]
In predict mode, traverse the query elements, find the matching modal builder for each one, and process the element with it.
- Parameters
query_ele_list (List[dict]) – A list of elements for predicting a request. For example: [{"image":"/path/to/image"}, {"text":"describe image in English"}].
result_recorder (DataRecord) – The result data recorder is used to save data that needs to be recorded during the inference process. Values are stored by calling the put method of the DataRecord.
- Returns
The text results processed by each modal builder.
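The traversal described above can be pictured as a simple dispatch over the keys of each query element. The sketch below is self-contained and does not call mindformers; the builders mapping, the function name, and the "[image: …]" placeholder text are hypothetical, and real modal builders typically do more than return a string.

```python
from typing import Callable, Dict, List


def process_predict_query_sketch(query_ele_list: List[Dict[str, str]],
                                 builders: Dict[str, Callable[[str], str]]
                                 ) -> List[str]:
    """Send each query element to the builder registered for its modal type."""
    text_list: List[str] = []
    for element in query_ele_list:
        for modal_type, builder in builders.items():
            if modal_type in element:
                text_list.append(builder(element[modal_type]))
                break
        else:
            # Assumption: an element no builder understands is an error.
            raise ValueError(f"no modal builder supports {element}")
    return text_list


# Hypothetical builders, keyed by modal type.
builders = {
    "image": lambda path: f"[image: {path}]",
    "text": lambda text: text,
}
query = [{"image": "/path/to/image"}, {"text": "describe image in English"}]
texts = process_predict_query_sketch(query, builders)
```

Each element produces one text result, in the same order as the input list.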
- process_train_item(conversation_list: List[List], result_recorder: DataRecord)[source]
In train mode, traverse the conversation data, find the matching modal builder for each element, and process the element with it.
- Parameters
conversation_list (List[List]) – A list of elements for dialogue data. For example: [["user", "<img>/path/to/image<img>describe image in English:"], ["assistant", "the image describe …."]]
result_recorder (DataRecord) – The result data recorder is used to save data that needs to be recorded during the inference process. Values are stored by calling the put method of the DataRecord.
- Returns
The text results processed by each modal builder.
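To make the conversation_list format more concrete, the sketch below splits a message from the example above into image and text segments. It assumes the tag convention shown in that example, where the same <img> marker opens and closes an image path; the function name and the (type, content) tuple layout are hypothetical and not part of the library.

```python
import re
from typing import List, Tuple

# Assumption: the same "<img>" marker both opens and closes an image path,
# as in the example message above.
IMG_PATTERN = re.compile(r"<img>(.*?)<img>")


def split_modal_segments(message: str) -> List[Tuple[str, str]]:
    """Split a message into ("image", path) and ("text", content) segments."""
    segments: List[Tuple[str, str]] = []
    # With one capturing group, re.split alternates plain text (even
    # indices) and captured image paths (odd indices).
    for index, part in enumerate(IMG_PATTERN.split(message)):
        if not part:
            continue
        segments.append(("image" if index % 2 == 1 else "text", part))
    return segments


message = "<img>/path/to/image<img>describe image in English:"
segments = split_modal_segments(message)
```

A training pipeline could then hand each segment to the builder for its modal type, in the same way as in predict mode.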
- property supported_modal
Return the modal builder types supported by this instance.
- Returns
A list containing the modal builder types supported by this instance.