mindspore.dataset.text.Vectors

class mindspore.dataset.text.Vectors

Pre-trained word embeddings, used to map tokens to vectors.

classmethod from_file(file_path, max_vectors=None)

Load a set of pre-trained vectors from a file.

Parameters

file_path (str) – Path to the file containing the pre-trained vectors.
max_vectors (int, optional) – Upper limit on the number of pre-trained vectors to load. Most pre-trained vector sets are sorted in descending order of word frequency, so when the entire set does not fit in memory, or is not needed for some other reason, this value limits loading to the first max_vectors entries. Default: None, no upper limit.
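
The effect of max_vectors can be illustrated with a small pure-Python sketch. This is not the MindSpore implementation: load_first_vectors is a hypothetical helper, and the file layout (one token per line followed by whitespace-separated floats) is an assumption made for the illustration. It only shows why truncating a frequency-sorted file keeps the most frequent words.

```python
import os
import tempfile


def load_first_vectors(file_path, max_vectors=None):
    """Hypothetical helper: read at most max_vectors rows from a text vector file.

    Assumed layout (not confirmed by this page): "<token> <float> <float> ..." per line.
    """
    table = {}
    with open(file_path, encoding="utf-8") as f:
        for line in f:
            # Stop once the requested number of vectors has been read.
            if max_vectors is not None and len(table) >= max_vectors:
                break
            parts = line.split()
            if len(parts) < 2:
                continue  # skip blank or malformed lines
            table[parts[0]] = [float(x) for x in parts[1:]]
    return table


# A tiny made-up file, ordered from most to least frequent word.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "toy_vectors.txt")
    with open(path, "w", encoding="utf-8") as f:
        f.write("the 0.1 0.2 0.3\nof 0.4 0.5 0.6\nzebra 0.7 0.8 0.9\n")

    full_table = load_first_vectors(path)               # no upper limit
    top_two = load_first_vectors(path, max_vectors=2)   # first two rows only

print(list(full_table))
print(list(top_two))
```

With the limit set to 2, the rare word at the end of the toy file is never read, which is the memory-saving behaviour described for max_vectors.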

Returns

Vectors, the pre-trained vectors loaded from the file.

Raises

TypeError – If file_path is not of type str.
RuntimeError – If file_path does not exist or is not accessible.
TypeError – If max_vectors is not of type int.
ValueError – If max_vectors is negative.
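
The error conditions listed above can be mirrored in a short pure-Python sketch. The names check_from_file_args and error_name are hypothetical, the order of the checks is an assumption, and the rejection of bool values is a choice made for this sketch; none of this is taken from the MindSpore source.

```python
import os


def check_from_file_args(file_path, max_vectors=None):
    """Hypothetical checker mirroring the documented error conditions."""
    if not isinstance(file_path, str):
        raise TypeError("file_path must be of type str")
    if max_vectors is not None:
        # bool is a subclass of int in Python; rejecting it is an assumption of this sketch.
        if isinstance(max_vectors, bool) or not isinstance(max_vectors, int):
            raise TypeError("max_vectors must be of type int")
        if max_vectors < 0:
            raise ValueError("max_vectors must not be negative")
    if not os.path.isfile(file_path) or not os.access(file_path, os.R_OK):
        raise RuntimeError("file_path does not exist or is not accessible")


def error_name(*args, **kwargs):
    """Return the name of the exception raised by the checker, or None."""
    try:
        check_from_file_args(*args, **kwargs)
    except (TypeError, ValueError, RuntimeError) as err:
        return type(err).__name__
    return None


print(error_name(123))                                       # file_path is not a str
print(error_name("vectors.txt", max_vectors=1.5))            # max_vectors is not an int
print(error_name("vectors.txt", max_vectors=-1))             # max_vectors is negative
print(error_name("/path/that/does/not/exist/vectors.txt"))   # missing file
```

In calling code, the same distinction suggests catching TypeError and ValueError for bad arguments separately from RuntimeError for a missing or unreadable file.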

Examples

>>> import mindspore.dataset.text as text
>>> vector = text.Vectors.from_file("/path/to/vectors/file", max_vectors=None)
>>> to_vectors = text.ToVectors(vector)
>>> # Look up the vector for each token in the loaded Vectors object.
>>> word_vector = to_vectors(["word1", "word2"])