mindformers.models.multi_modal.ModalContentTransformTemplate

class mindformers.models.multi_modal.ModalContentTransformTemplate(output_columns: List[str] = None, tokenizer=None, mode='predict', vstack_columns: List[str] = None, modal_content_padding_size=1, max_length=2048, **kwargs)[source]

Base class for modal content transform templates. It should be implemented by each specific model. A child class can override the methods build_conversation_input_text, update_result_before_output, batch, and post_process to meet the model's requirements.

Parameters

output_columns (List[str], optional) – Specify which columns will be output. Default: None.
tokenizer (Tokenizer, optional) – An already constructed tokenizer for the model. Default: None.
mode (str, optional) – Running mode, either 'predict' or 'train'. Default: 'predict'.
vstack_columns (List[str], optional) – Specify which columns will be vertically stacked (vstack) when batching data. Default: None.
modal_content_padding_size (int, optional) – Used in training mode by subclasses of the template. It usually represents the maximum number of modal contents (such as images) supported within a training sample. When a training sample contains fewer modal contents than this value, the modal contents are padded up to that number. Default: 1.
max_length (int, optional) – Used in training mode by subclasses of the template. It usually represents the maximum length to which a training sample can be padded after tokenization and content masking. Default: 2048.
kwargs (dict, optional) – Additional keyword arguments, reserved for future extension.

Examples

>>> from mindformers.models.multi_modal import ModalContentTransformTemplate
>>> ModalContentTransformTemplate().supported_modal
[]
>>> # Note:
>>> #     The property of 'supported_modal' should be inherited by subclasses,
>>> #     and subclasses implement the corresponding modal builders.
>>> #     The current base class does not support any modal builders, so it returns '[]'.

batch(data_list, token_padding_length, **kwargs)[source]

Batch the data of the columns listed in output_columns.

Parameters

data_list (list) – A list containing multiple data items.
token_padding_length (int) – Used to pad the length of "tokens" to ensure that all text data has the same length.
kwargs (dict, optional) – Additional keyword arguments, reserved for future extension.

Returns

A dict that stores the batched data.
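
The role of token_padding_length can be illustrated with a minimal, standalone sketch. It does not call mindformers and is not the library's implementation: the function name batch_tokens, the pad_token_id argument, right-side padding, and truncation of over-long items are all assumptions made for illustration.

```python
from typing import Dict, List


def batch_tokens(
    data_list: List[Dict[str, list]],
    token_padding_length: int,
    pad_token_id: int = 0,  # hypothetical pad id, chosen for this sketch
) -> Dict[str, list]:
    """Pad every "tokens" entry to the same length and collect them in a dict."""
    batched: Dict[str, list] = {"tokens": []}
    for item in data_list:
        # Assumption: items longer than the target length are truncated.
        tokens = list(item["tokens"])[:token_padding_length]
        # Assumption: padding is appended on the right.
        tokens += [pad_token_id] * (token_padding_length - len(tokens))
        batched["tokens"].append(tokens)
    return batched


batched = batch_tokens(
    [{"tokens": [5, 6, 7]}, {"tokens": [8]}],
    token_padding_length=4,
)
print(batched)  # {'tokens': [[5, 6, 7, 0], [8, 0, 0, 0]]}
```

After batching, every row has the same length, which is what allows the rows to be stacked into a single array.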

abstract build_conversation_input_text(raw_inputs, result_recorder: DataRecord)[source]

Used in predict mode to assemble a conversation from the incoming inputs. Usually inherited and implemented by subclasses.

Parameters

raw_inputs (str) – input data.
result_recorder (DataRecord) – The result data recorder is used to save data that needs to be recorded during the inference process. Values are stored by calling the put method of the DataRecord.

Returns

A str. The assembled conversation.

build_labels(text_id_list, result_recorder, **kwargs)[source]

Used in training mode. Subclasses override this method to construct the labels needed for training from text data.

Parameters

text_id_list (list) – A list containing text data identifiers or indices.
result_recorder (DataRecord) – The result data recorder is used to save data that needs to be recorded during the inference process. Values are stored by calling the put method of the DataRecord.
kwargs (dict, optional) – Additional keyword arguments, reserved for future extension.

build_modal_context(input_ids, result_recorder: DataRecord, **kwargs)[source]

According to the requirements of the modal builder, process the input_ids and finally return the processed input_ids.

Parameters

input_ids (list) – input data.
result_recorder (DataRecord) – The result data recorder is used to save data that needs to be recorded during the inference process. Values are stored by calling the put method of the DataRecord.
kwargs (dict, optional) – Additional keyword arguments, reserved for future extension.

Returns

The processed input_ids.

get_need_update_output_items(result: DataRecord)[source]

Retrieve the output items that need to be updated.

Parameters: result (DataRecord) – The result data recorder is used to save data that needs to be recorded during the inference process. Values are stored by calling the put method of the DataRecord.
Returns: A Dict. Defaults to an empty dict.

post_process(output_ids, **kwargs)[source]

Decode the model's output_ids into text strings.

Parameters

output_ids (list) – A list containing the model's output_ids.
kwargs (dict, optional) – Additional keyword arguments, reserved for future extension.

Returns

A list containing all decoded text strings.
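
The decoding step can be pictured with a small standalone sketch. ToyTokenizer and decode_outputs are hypothetical names invented here, and the three-word vocabulary is made up; the real method uses the tokenizer passed to the template, whose decoding options are not shown on this page.

```python
from typing import Dict, List


class ToyTokenizer:
    """Hypothetical stand-in tokenizer with a three-entry vocabulary."""

    def __init__(self) -> None:
        self.id_to_word: Dict[int, str] = {0: "<pad>", 1: "hello", 2: "world"}

    def decode(self, ids: List[int]) -> str:
        # Skip the pad id (0) and join the remaining words with spaces.
        return " ".join(self.id_to_word[i] for i in ids if i != 0)


def decode_outputs(output_ids: List[List[int]], tokenizer: ToyTokenizer) -> List[str]:
    """Decode each sequence of ids into one text string."""
    return [tokenizer.decode(ids) for ids in output_ids]


decoded = decode_outputs([[1, 2, 0], [2, 0, 0]], ToyTokenizer())
print(decoded)  # ['hello world', 'world']
```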

process_predict_query(query_ele_list: List[Dict], result_recorder: DataRecord)[source]

In predict mode, traverse the query elements, find the corresponding modal builder for each one, and process the element with it.

Parameters

query_ele_list (List[dict]) – A list of elements for predicting a request. For example: [{"image":"/path/to/image"}, {"text":"describe image in English"}].
result_recorder (DataRecord) – The result data recorder is used to save data that needs to be recorded during the inference process. Values are stored by calling the put method of the DataRecord.

Returns

The text results processed by each modal builder.
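
The traversal described above can be sketched without mindformers as a lookup from modal type to builder. MODAL_BUILDERS, process_query, and the "[image:...]" placeholder format are hypothetical and exist only in this sketch; real modal builders are defined by the model-specific subclass.

```python
from typing import Callable, Dict, List

# Hypothetical builders: each maps a raw value to the text placed in the prompt.
MODAL_BUILDERS: Dict[str, Callable[[str], str]] = {
    "image": lambda value: f"[image:{value}]",
    "text": lambda value: value,
}


def process_query(query_ele_list: List[Dict[str, str]]) -> List[str]:
    """Traverse the query elements and dispatch each one to its modal builder."""
    text_list: List[str] = []
    for element in query_ele_list:
        for modal_type, value in element.items():
            builder = MODAL_BUILDERS.get(modal_type)
            if builder is None:
                raise ValueError(f"unsupported modal type: {modal_type}")
            text_list.append(builder(value))
    return text_list


texts = process_query(
    [{"image": "/path/to/image"}, {"text": "describe image in English"}]
)
print(texts)  # ['[image:/path/to/image]', 'describe image in English']
```

Raising an error for an unknown modal type is a design choice of this sketch; it makes a missing builder visible instead of silently dropping the element.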

process_train_item(conversation_list: List[List], result_recorder: DataRecord)[source]

In train mode, traverse the conversation elements, find the corresponding modal builder for each one, and process the element with it.

Parameters

conversation_list (List[List]) – A list of elements for dialogue data. For example: [["user", "<img>/path/to/image<img>describe image in English:"], ["assistant", "the image describe …."]]
result_recorder (DataRecord) – The result data recorder is used to save data that needs to be recorded during the inference process. Values are stored by calling the put method of the DataRecord.

Returns

The text results processed by each modal builder.
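
To make the conversation_list format concrete, the following standalone sketch separates image paths from the surrounding text. It assumes the `<img>` marker is used as both the opening and closing delimiter, as in the example above; the function split_modal_content and the assistant reply are hypothetical, and this is not how the library itself parses training data.

```python
import re
from typing import List, Tuple

# Assumption: "<img>" both opens and closes an image path, as in the example above.
IMG_PATTERN = re.compile(r"<img>(.*?)<img>")


def split_modal_content(
    conversation_list: List[List[str]],
) -> Tuple[List[str], List[List[str]]]:
    """Return the image paths and the conversation with image markers removed."""
    image_paths: List[str] = []
    text_only: List[List[str]] = []
    for role, message in conversation_list:
        image_paths.extend(IMG_PATTERN.findall(message))
        text_only.append([role, IMG_PATTERN.sub("", message)])
    return image_paths, text_only


paths, conversation = split_modal_content(
    [
        ["user", "<img>/path/to/image<img>describe image in English:"],
        ["assistant", "a cat on a sofa"],  # made-up reply for illustration
    ]
)
print(paths)         # ['/path/to/image']
print(conversation)  # [['user', 'describe image in English:'], ['assistant', 'a cat on a sofa']]
```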

property supported_modal

Return the types of modal builder supported by an instance.

Returns: List type, containing the types of modal builder supported by an instance.