mindspore.dataset.text.ToVectors

class mindspore.dataset.text.ToVectors(vectors, unk_init=None, lower_case_backup=False)

Look up each token in the input vector table and map it to its corresponding vector.

Parameters
  • vectors (Vectors) – A vectors object.

  • unk_init (sequence, optional) – Sequence used to initialize out-of-vectors (OOV) tokens. Default: None, which initializes OOV tokens with zero vectors.

  • lower_case_backup (bool, optional) – Whether to fall back to the lowercase form of a token. If False, each token is looked up only in its original case. If True, each token is first looked up in its original case; if it is not found among the keys of the property stoi, its lowercase form is looked up. Default: False.

Raises
  • TypeError – If unk_init is not of type sequence.

  • TypeError – If elements of unk_init are not of type float or int.

  • TypeError – If lower_case_backup is not of type bool.

Supported Platforms:

CPU

Examples

>>> import mindspore.dataset as ds
>>> import mindspore.dataset.text as text
>>>
>>> # Load vectors from file
>>> vectors = text.Vectors.from_file("/path/to/vectors/file")
>>> # Use ToVectors operation to map tokens to vectors
>>> to_vectors = text.ToVectors(vectors)
>>>
>>> text_file_list = ["/path/to/text_file_dataset_file"]
>>> text_file_dataset = ds.TextFileDataset(dataset_files=text_file_list)
>>> text_file_dataset = text_file_dataset.map(operations=[to_vectors])
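
The lookup rule described under Parameters can be sketched in plain Python. This is only an illustration of the documented behavior, not the MindSpore implementation: the helper `lookup_vector`, the `table` list, and the sample data are hypothetical, and only the name `stoi` comes from the description above.

```python
from typing import Dict, List, Optional, Sequence


def lookup_vector(
    token: str,
    stoi: Dict[str, int],
    table: List[List[float]],
    unk_init: Optional[Sequence[float]] = None,
    lower_case_backup: bool = False,
) -> List[float]:
    """Hypothetical helper mirroring the documented ToVectors lookup rule."""
    # First try the token exactly as given (original case).
    index = stoi.get(token)
    # Optionally fall back to the lowercase form if the original was not found.
    if index is None and lower_case_backup:
        index = stoi.get(token.lower())
    if index is not None:
        return list(table[index])
    # Out-of-vectors token: use unk_init if provided, otherwise a zero vector.
    if unk_init is None:
        dim = len(table[0]) if table else 0
        return [0.0] * dim
    return [float(x) for x in unk_init]


# Illustrative data only: two tokens with 2-dimensional vectors.
stoi = {"ok": 0, "hello": 1}
table = [[0.1, 0.2], [0.3, 0.4]]

print(lookup_vector("ok", stoi, table))                          # found in original case
print(lookup_vector("OK", stoi, table))                          # not found, zero vector
print(lookup_vector("OK", stoi, table, lower_case_backup=True))  # found via lowercase fallback
print(lookup_vector("bye", stoi, table, unk_init=[9, 9]))        # OOV, initialized from unk_init
```

With `lower_case_backup=False`, the token "OK" is treated as out-of-vectors even though "ok" is in the table, which is why enabling the fallback can matter for case-sensitive vector files.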