Class NormalizeUTF8
Defined in File text.h
Inheritance Relationships
Base Type
public mindspore::dataset::TensorTransform
(Class TensorTransform)
Class Documentation
-
class NormalizeUTF8 : public mindspore::dataset::TensorTransform
Applies Unicode normalization to UTF-8 string tensors.
Public Functions
-
explicit NormalizeUTF8(NormalizeForm normalize_form = NormalizeForm::kNfkc)
Constructor.
- Parameters
normalize_form – [in] Normalization form to apply. Valid values are NormalizeForm::kNone, NormalizeForm::kNfc, NormalizeForm::kNfkc, NormalizeForm::kNfd, and NormalizeForm::kNfkd (default: NormalizeForm::kNfkc). See http://unicode.org/reports/tr15/ for details.
NormalizeForm::kNone leaves the input string tensor unchanged.
NormalizeForm::kNfc normalizes with Normalization Form C.
NormalizeForm::kNfkc normalizes with Normalization Form KC.
NormalizeForm::kNfd normalizes with Normalization Form D.
NormalizeForm::kNfkd normalizes with Normalization Form KD.
Example
/* Define operations */
auto normalizeutf8_op = text::NormalizeUTF8();

/* dataset is an instance of Dataset object */
dataset = dataset->Map({normalizeutf8_op},  // operations
                       {"text"});           // input columns
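To see why the choice of normalization form matters, note that the same visible character can have more than one UTF-8 encoding. The sketch below is a toy illustration, not MindSpore's implementation: the hypothetical helper ComposeAcuteE handles only the single pair "e" + U+0301 (combining acute accent) and rewrites it to the precomposed U+00E9, which is what Normalization Form C does for that one character. Real normalization covers the full Unicode tables described in the report linked above.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper for illustration only (not part of MindSpore).
// Replaces every occurrence of 'e' + U+0301 (decomposed form, as NFD would
// produce) with U+00E9 (precomposed form, as NFC would produce).
// All other bytes are copied through unchanged.
std::string ComposeAcuteE(const std::string &utf8) {
  const std::string decomposed = "\x65\xCC\x81";  // 'e' followed by U+0301 in UTF-8
  const std::string composed = "\xC3\xA9";        // U+00E9 in UTF-8
  std::string out;
  std::size_t pos = 0;
  while (pos < utf8.size()) {
    if (utf8.compare(pos, decomposed.size(), decomposed) == 0) {
      // Found the decomposed sequence: emit the precomposed character.
      out += composed;
      pos += decomposed.size();
    } else {
      // Any other byte is copied as-is.
      out += utf8[pos];
      ++pos;
    }
  }
  return out;
}
```

The two encodings render identically but compare as different byte strings, so text that mixes them can defeat exact matching and vocabulary lookup. Normalizing every string to one form before such steps avoids that mismatch.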
-
~NormalizeUTF8() = default
Destructor.