Differences with torchtext.data.functional.load_sp_model
torchtext.data.functional.load_sp_model
torchtext.data.functional.load_sp_model(spm)
For more information, see torchtext.data.functional.load_sp_model.
mindspore.dataset.text.SentencePieceTokenizer
class mindspore.dataset.text.SentencePieceTokenizer(mode, out_type)
For more information, see mindspore.dataset.text.SentencePieceTokenizer.
Differences
PyTorch: Loads a SentencePiece model.
MindSpore: Constructs a SentencePiece tokenizer; loading the SentencePiece model is part of construction.
| Categories | Subcategories | PyTorch | MindSpore | Difference |
| --- | --- | --- | --- | --- |
| Parameter | Parameter1 | spm | mode | MindSpore supports either a SentencePieceVocab object or the path of a SentencePiece model |
| | Parameter2 | - | out_type | The output type of the tokenizer |
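The two parameter rows above can be illustrated with a minimal sketch. The helper names `build_tokenizer` and `vocab_from_corpus`, the `as_ids` flag, the vocabulary size of 5000, and the character coverage of 0.9995 are assumptions chosen for illustration rather than values from this page. The MindSpore import is deferred into the functions so the definitions load even where MindSpore is not installed.

```python
def build_tokenizer(mode, as_ids=False):
    """Build a MindSpore SentencePieceTokenizer.

    mode:   path of a SentencePiece model file, or a SentencePieceVocab object.
    as_ids: hypothetical convenience flag; True selects integer ids (INT),
            False selects subword strings (STRING).
    """
    import mindspore.dataset.text as text  # deferred import

    if as_ids:
        out_type = text.SPieceTokenizerOutType.INT
    else:
        out_type = text.SPieceTokenizerOutType.STRING
    return text.SentencePieceTokenizer(mode, out_type=out_type)


def vocab_from_corpus(corpus_path, vocab_size=5000):
    """Train a SentencePieceVocab from a plain-text corpus (illustrative settings)."""
    import mindspore.dataset.text as text  # deferred import

    return text.SentencePieceVocab.from_file(
        [corpus_path],                      # list of corpus files
        vocab_size,                         # assumed size, adjust to the corpus
        0.9995,                             # assumed character coverage
        text.SentencePieceModel.UNIGRAM,    # assumed model type
        {},                                 # no extra training parameters
    )
```

Under these assumptions, `build_tokenizer("sentencepiece.bpe.model")` passes a model path as `mode`, while `build_tokenizer(vocab_from_corpus("corpus.txt"), as_ids=True)` passes a vocab object and requests integer ids; `corpus.txt` is a hypothetical file name.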
Code Example
from download import download

# Fetch a pretrained SentencePiece BPE model used by both examples.
url = "https://mindspore-website.obs.cn-north-4.myhuaweicloud.com/notebook/datasets/sentencepiece.bpe.model"
download(url, './sentencepiece.bpe.model', replace=True)

# PyTorch: load_sp_model returns the loaded SentencePiece model.
from torchtext.data.functional import load_sp_model
model = load_sp_model("sentencepiece.bpe.model")

# MindSpore: the model is loaded while the tokenizer is constructed.
import mindspore.dataset.text as text
tokenizer = text.SentencePieceTokenizer("sentencepiece.bpe.model", out_type=text.SPieceTokenizerOutType.STRING)
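To show where the two APIs diverge after loading, here is a hedged sketch that tokenizes one sentence with each framework. The function names and the `model_path` and `sentence` parameters are hypothetical; the sketch assumes `torchtext.data.functional.sentencepiece_tokenizer` is available in the installed torchtext and that the installed MindSpore version supports calling the transform directly in eager mode. Imports are deferred so that only the framework actually used needs to be installed.

```python
def tokenize_with_torchtext(model_path, sentence):
    """Load the model with load_sp_model, then tokenize a single sentence."""
    from torchtext.data.functional import load_sp_model, sentencepiece_tokenizer

    sp_model = load_sp_model(model_path)
    # sentencepiece_tokenizer wraps the loaded model in a generator function
    # that consumes an iterable of sentences.
    tokens_generator = sentencepiece_tokenizer(sp_model)
    return list(tokens_generator([sentence]))[0]


def tokenize_with_mindspore(model_path, sentence):
    """Construct the tokenizer (which loads the model) and tokenize a sentence."""
    import mindspore.dataset.text as text

    tokenizer = text.SentencePieceTokenizer(
        model_path, out_type=text.SPieceTokenizerOutType.STRING
    )
    # Eager-mode call on a single string; the result is converted to a list.
    return list(tokenizer(sentence))
```

In PyTorch the loaded model is passed to a separate tokenizer function, whereas in MindSpore the object constructed from the model path is itself the tokenizer. A call such as `tokenize_with_mindspore("sentencepiece.bpe.model", "Hello world")` would exercise the MindSpore path, given the model file downloaded above.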