mindspore.nn.Embedding

class mindspore.nn.Embedding(vocab_size, embedding_size, use_one_hot=False, embedding_table='normal', dtype=mstype.float32, padding_idx=None)[source]

A simple lookup table that stores embeddings of a fixed dictionary and size.

This module is often used to store word embeddings and retrieve them using indices. The input to the module is a list of indices, and the output is the corresponding word embeddings.

Note

When use_one_hot is set to True, the data type of x must be mindspore.int32.

Parameters
  • vocab_size (int) – Size of the dictionary of embeddings.

  • embedding_size (int) – The size of each embedding vector.

  • use_one_hot (bool) – Specifies whether to apply one-hot encoding to the input. Default: False.

  • embedding_table (Union[Tensor, str, Initializer, numbers.Number]) – Initializer for the embedding table. When a string is specified, refer to mindspore.common.initializer for its valid values. Default: 'normal'.

  • dtype (mindspore.dtype) – Data type of x. Default: mstype.float32.

  • padding_idx (int, None) – If set, the embedding vector at index padding_idx is initialized to zeros, so inputs equal to padding_idx map to a zero vector (see the sketch after this list). Default: None, in which case the feature is disabled.
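
A minimal sketch of the padding_idx behavior, following the description above (the sizes below are arbitrary):

>>> import mindspore
>>> from mindspore import Tensor, nn
>>> import numpy as np
>>> net = nn.Embedding(10, 4, padding_idx=0)
>>> # The table row at padding_idx is initialized to zeros, so index 0 maps to a zero vector.
>>> output = net(Tensor(np.array([[0, 5]]), mindspore.int32))
>>> print(output[0, 0])
[0. 0. 0. 0.]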

Inputs:
  • x (Tensor) - Tensor of shape \((\text{batch_size}, \text{x_length})\). The elements of the Tensor must be integers and not larger than vocab_size; otherwise, the corresponding embedding vector is zero (see the sketch after the Outputs block). The data type is int32 or int64.

Outputs:

Tensor of shape \((\text{batch_size}, \text{x_length}, \text{embedding_size})\).
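
A minimal sketch of the out-of-range behavior noted in Inputs (assuming the documented zero-vector result for indices outside the table):

>>> import mindspore
>>> from mindspore import Tensor, nn
>>> import numpy as np
>>> net = nn.Embedding(4, 3)
>>> # Index 10 is outside [0, 4), so its embedding vector is zero.
>>> output = net(Tensor(np.array([[1, 10]]), mindspore.int32))
>>> print(output[0, 1])
[0. 0. 0.]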

Raises
  • TypeError – If vocab_size or embedding_size is not an int.

  • TypeError – If use_one_hot is not a bool.

  • ValueError – If padding_idx is an int that is not in the range [0, vocab_size).

Supported Platforms:

Ascend GPU CPU

Examples

>>> import mindspore
>>> from mindspore import Tensor, nn
>>> import numpy as np
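>>> # The third positional argument enables use_one_hot, so x must be int32.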
>>> net = nn.Embedding(20000, 768, True)
>>> x = Tensor(np.ones([8, 128]), mindspore.int32)
>>> # Maps the input word IDs to word embedding.
>>> output = net(x)
>>> result = output.shape
>>> print(result)
(8, 128, 768)
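
A further sketch, using the Tensor form of embedding_table listed in Parameters (the table values are chosen only for illustration):

>>> # Initialize the table from an explicit Tensor of shape (vocab_size, embedding_size).
>>> table = Tensor(np.arange(12, dtype=np.float32).reshape(4, 3))
>>> net2 = nn.Embedding(4, 3, embedding_table=table)
>>> output2 = net2(Tensor(np.array([[1, 3]]), mindspore.int32))
>>> print(output2)
[[[ 3.  4.  5.]
  [ 9. 10. 11.]]]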