mindformers.models.LlamaForCausalLM
- class mindformers.models.LlamaForCausalLM(config: LlamaConfig = None)
Provides the Llama training loss or logits through the network.
- Parameters
config (LlamaConfig, optional) – The config of the Llama model. Default: None.
- Inputs:
input_ids (Tensor) - The indices of the input sequence tokens in the vocabulary, with data type Int64/Int32. Tensor of shape \((batch, seq\_length)\).
labels (Tensor, optional) - The labels of the inputs, with data type Int64/Int32. Tensor of shape \((batch, seq\_length)\). Default: None.
input_position (Tensor, optional) - The position ids of the inputs (in incremental inference mode), an increasing sequence with data type Int64/Int32. Tensor of shape \((batch, seq\_length)\). Default: None.
position_ids (Tensor, optional) - The position ids of the inputs, an increasing sequence with data type Int64/Int32. Tensor of shape \((batch, seq\_length)\). Default: None.
attention_mask (Tensor, optional) - The padding mask of the input sentences, where 0 indicates a padding position, with data type Int64/Int32. Tensor of shape \((batch, seq\_length)\). Default: None.
input_embeds (Tensor, optional) - The embeddings of the inputs, with data type Float32/Float16. Tensor of shape \((batch, seq\_length, hidden\_size)\). Default: None.
init_reset (Tensor, optional) - A Bool tensor of shape \((1)\), used to clear the past key and past value parameters used in incremental prediction. Only valid when use_past is True. Default: Tensor([True]).
batch_valid_length (Tensor, optional) - Int32 tensor of shape [batch_size] holding the index that has already been calculated for each sequence. Used for incremental prediction when use_past is True. Default: None.
batch_index (Tensor, optional) - Deprecated argument. Will be removed in the future. Default: None.
zactivate_len (Tensor, optional) - Deprecated argument. Will be removed in the future. Default: None.
block_tables (Tensor, optional) - Int64 Tensor that stores the mapping tables for each sequence. Default: None.
slot_mapping (Tensor, optional) - Int32 Tensor that stores the physical slot index of each token in the cache. Default: None.
prefix_keys_values (Tensor, optional) - Deprecated argument. Will be removed in the future. Default: None.
llm_boost_inputs (Tensor, optional) - Deprecated argument. Will be removed in the future. Default: None.
q_seq_lens (Tensor, optional) - In parallel decoding, the query may be flattened, so the Paged Attention operator needs q_seq_lens to obtain the length information. Default: None.
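The shape and mask conventions listed above can be illustrated with a small, framework-free sketch. It pads token id lists to \((batch, seq\_length)\), marks padding with 0 in the attention mask, and builds an increasing position sequence. The helper name build_inputs and the pad id PAD_ID are hypothetical and not part of MindFormers; in practice the pad id comes from the tokenizer.

```python
from typing import List, Tuple

PAD_ID = 0  # hypothetical pad token id; the real value comes from the tokenizer


def build_inputs(
    sequences: List[List[int]], seq_length: int
) -> Tuple[List[List[int]], List[List[int]], List[List[int]]]:
    """Pad token id lists to (batch, seq_length) and derive mask and position ids."""
    input_ids, attention_mask, position_ids = [], [], []
    for tokens in sequences:
        tokens = tokens[:seq_length]  # truncate overly long sequences
        pad = seq_length - len(tokens)
        input_ids.append(tokens + [PAD_ID] * pad)
        attention_mask.append([1] * len(tokens) + [0] * pad)  # 0 marks padding
        position_ids.append(list(range(seq_length)))  # increasing sequence
    return input_ids, attention_mask, position_ids


ids, mask, pos = build_inputs([[5, 6, 7], [8, 9]], seq_length=4)
print(ids)   # [[5, 6, 7, 0], [8, 9, 0, 0]]
print(mask)  # [[1, 1, 1, 0], [1, 1, 0, 0]]
print(pos)   # [[0, 1, 2, 3], [0, 1, 2, 3]]
```

The nested lists would then typically be wrapped as Int32 tensors, for example with mindspore.Tensor(ids, mindspore.int32), before being passed to the network.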
- Outputs:
Tensor. In training mode, the output Tensor contains the loss; in prediction mode, it contains the logits; in evaluation mode, it contains the logits, tokens, and input masks.
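To make the relationship between logits and the training loss concrete, the following is a generic causal language model sketch, not the MindFormers implementation. It computes the mean cross-entropy between per-position logits and target labels, skipping positions whose mask is 0. The function name next_token_loss is hypothetical, and the actual loss used by LlamaForCausalLM may differ in details such as label shifting and reduction.

```python
import math
from typing import List


def next_token_loss(logits: List[List[float]], labels: List[int], mask: List[int]) -> float:
    """Mean cross-entropy over positions where mask == 1.

    logits[t] holds the vocabulary scores at step t; labels[t] is the target token id.
    """
    total, count = 0.0, 0
    for scores, target, keep in zip(logits, labels, mask):
        if not keep:
            continue  # padded positions contribute nothing
        peak = max(scores)  # subtract the max for numerical stability
        log_sum = peak + math.log(sum(math.exp(s - peak) for s in scores))
        total += log_sum - scores[target]
        count += 1
    return total / max(count, 1)


# Uniform scores over a 4-token vocabulary give a loss of ln(4) per position.
uniform = [[0.0, 0.0, 0.0, 0.0]] * 3
loss = next_token_loss(uniform, labels=[1, 2, 0], mask=[1, 1, 0])
print(round(loss, 4))  # 1.3863
```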
Examples
>>> from mindformers.models.llama import LlamaConfig, LlamaForCausalLM
>>> import mindspore as ms
>>> ms.set_context(mode=0)
>>> config = LlamaConfig(batch_size=2)
>>> network = LlamaForCausalLM(config=config)
>>> type(network)
<class 'mindformers.models.llama.llama.LlamaForCausalLM'>
>>> from mindformers import LlamaForCausalLM
>>> network = LlamaForCausalLM.from_pretrained('llama2_7b')
>>> type(network)
<class 'mindformers.models.llama.llama.LlamaForCausalLM'>