mindspore.dataset.text.RegexReplace

class mindspore.dataset.text.RegexReplace(pattern, replace, replace_all=True)

Replace part of a UTF-8 string tensor with the given text, according to a regular expression.

See https://unicode-org.github.io/icu/userguide/strings/regexp.html for the supported regex pattern syntax.

Note

RegexReplace is not yet supported on Windows.

Parameters
  • pattern (str) – The regular expression pattern to match.

  • replace (str) – The string that replaces each matched element.

  • replace_all (bool, optional) – If False, replace only the first matched element; if True, replace all matched elements. Default: True.
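
The effect of replace_all can be illustrated with a small pure-Python analogy. This is only a sketch: it uses Python's standard re module rather than the ICU engine that the link above describes, so pattern syntax may differ in places, and the helper name regex_replace_sketch is hypothetical and not part of MindSpore.

```python
import re


def regex_replace_sketch(s, pattern, replace, replace_all=True):
    """Hypothetical pure-Python analogy of the replace_all flag (not MindSpore code)."""
    # In re.sub, count=0 replaces every match and count=1 stops after the first.
    count = 0 if replace_all else 1
    return re.sub(pattern, replace, s, count=count)


line = "Canada borders the USA; Canada is large."
all_replaced = regex_replace_sketch(line, "Canada", "China")
first_only = regex_replace_sketch(line, "Canada", "China", replace_all=False)
```

Here all_replaced has both occurrences of "Canada" substituted, while first_only changes only the first one.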

Raises
  • TypeError – If pattern is not of type string.

  • TypeError – If replace is not of type string.

  • TypeError – If replace_all is not of type bool.
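
The type requirements listed above can be mirrored by a caller-side check before constructing the operation. The following is a minimal sketch under the assumption that you want to validate arguments yourself; validate_regex_replace_args is a hypothetical helper, not a MindSpore function, and the library's own checks and error messages may differ.

```python
def validate_regex_replace_args(pattern, replace, replace_all=True):
    """Hypothetical pre-check mirroring the documented TypeError conditions."""
    if not isinstance(pattern, str):
        raise TypeError("pattern must be of type str")
    if not isinstance(replace, str):
        raise TypeError("replace must be of type str")
    # isinstance(1, bool) is False, so integers such as 0 or 1 are rejected here.
    if not isinstance(replace_all, bool):
        raise TypeError("replace_all must be of type bool")


# Valid arguments pass silently.
validate_regex_replace_args("Canada", "China", replace_all=False)
```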

Supported Platforms:

CPU

Examples

>>> import mindspore.dataset as ds
>>> import mindspore.dataset.text as text
>>>
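>>> # Replace every occurrence of 'Canada' with 'China' (replace_all defaults to True).
>>> pattern = 'Canada'
>>> replace = 'China'
>>> replace_op = text.RegexReplace(pattern, replace)
>>> # Placeholder path; point this at a real text file.
>>> text_file_list = ["/path/to/text_file_dataset_file"]
>>> text_file_dataset = ds.TextFileDataset(dataset_files=text_file_list)
>>> # Apply the replacement to each line of the dataset.
>>> text_file_dataset = text_file_dataset.map(operations=replace_op)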
Tutorial Examples: