mindspore.dataset.vision.read_video

mindspore.dataset.vision.read_video(filename, start_pts=0, end_pts=None, pts_unit='pts')

Read the video frames, audio data, and metadata from a video file.

The supported file formats are AVI, H264, H265, MOV, MP4, and WMV.

Parameters
  • filename (str) – The path to the video file to be read.

  • start_pts (Union[float, Fraction, int], optional) – The start presentation timestamp of the video. Default: 0.

  • end_pts (Union[float, Fraction, int], optional) – The end presentation timestamp of the video. Default: None, which is represented by 2147483647.

  • pts_unit (str, optional) – The unit of the timestamps. It can be any of ["pts", "sec"]. Default: "pts".

Returns

  • numpy.ndarray, four-dimensional uint8 data for the video. The format is [T, H, W, C], where T is the number of frames, H is the height, W is the width, and C is the number of channels (RGB).

  • numpy.ndarray, two-dimensional float data for the audio. The format is [C, L], where C is the number of channels and L is the number of points in one channel.

  • dict, metadata for the video and audio. It contains video_fps data of type float and audio_fps data of type int.
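
The layouts of the three return values can be illustrated with a small helper. This is only a sketch: summarize_video is a hypothetical name, the arrays are synthetic stand-ins rather than decoded data, and it assumes that audio_fps is the audio sample rate and that the frame rate is constant, so the computed durations are estimates.

```python
import numpy as np


def summarize_video(video, audio, metadata):
    """Derive basic facts from the three values returned by read_video.

    Hypothetical helper; assumes a constant frame rate and that
    audio_fps is the number of audio points per second.
    """
    num_frames, height, width, channels = video.shape  # [T, H, W, C]
    audio_channels, num_points = audio.shape           # [C, L]
    video_fps = metadata["video_fps"]
    audio_fps = metadata["audio_fps"]
    return {
        "frames": num_frames,
        "frame_size": (height, width),
        "color_channels": channels,
        # Guard against a zero rate, e.g. when a stream is absent.
        "video_seconds": num_frames / video_fps if video_fps else None,
        "audio_channels": audio_channels,
        "audio_seconds": num_points / audio_fps if audio_fps else None,
    }


# Synthetic stand-ins with the documented layouts (not real decoded data).
video = np.zeros((50, 4, 6, 3), dtype=np.uint8)
audio = np.zeros((2, 88200), dtype=np.float32)
metadata = {"video_fps": 25.0, "audio_fps": 44100}
summary = summarize_video(video, audio, metadata)
```

With real data, the same helper could be applied directly to the tuple returned by read_video.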

Raises
  • TypeError – If filename is not of type str.

  • TypeError – If start_pts is not of type [float, Fraction, int].

  • TypeError – If end_pts is not of type [float, Fraction, int].

  • TypeError – If pts_unit is not of type str.

  • RuntimeError – If filename does not exist, is not a regular file, or is not a supported video file.

  • ValueError – If start_pts is less than 0.

  • ValueError – If end_pts is less than start_pts.

  • ValueError – If pts_unit is not in ["pts", "sec"].
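
The argument checks listed above can be mirrored in plain Python, for example to validate user input before calling read_video. The sketch below is not the library's implementation: check_read_video_args is a hypothetical helper, the order of the checks is an assumption, and the file-related RuntimeError conditions are left to read_video itself.

```python
from fractions import Fraction

# Documented stand-in value for end_pts=None.
END_PTS_DEFAULT = 2147483647


def check_read_video_args(filename, start_pts=0, end_pts=None, pts_unit="pts"):
    """Hypothetical pre-check mirroring the documented TypeError/ValueError cases.

    Returns the (start_pts, end_pts) pair with None replaced by its stand-in.
    """
    if not isinstance(filename, str):
        raise TypeError("filename must be of type str")
    if end_pts is None:
        end_pts = END_PTS_DEFAULT
    for name, value in (("start_pts", start_pts), ("end_pts", end_pts)):
        if not isinstance(value, (float, Fraction, int)):
            raise TypeError(f"{name} must be of type float, Fraction or int")
    if not isinstance(pts_unit, str):
        raise TypeError("pts_unit must be of type str")
    if start_pts < 0:
        raise ValueError("start_pts must not be less than 0")
    if end_pts < start_pts:
        raise ValueError("end_pts must not be less than start_pts")
    if pts_unit not in ("pts", "sec"):
        raise ValueError('pts_unit must be "pts" or "sec"')
    return start_pts, end_pts
```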

Supported Platforms:

CPU

Examples

>>> import mindspore.dataset.vision as vision
>>> video_output, audio_output, metadata_output = vision.read_video("/path/to/file")
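
To read only part of a file, start_pts and end_pts can be given in seconds by setting pts_unit to "sec". The wrapper below is a minimal sketch: read_clip_seconds and its reader parameter are hypothetical names introduced here, and the reader hook exists only so the wrapper can be exercised without MindSpore or a real video file.

```python
def read_clip_seconds(path, start_sec, end_sec, reader=None):
    """Read the part of a video between two timestamps given in seconds.

    `reader` is a hypothetical hook for testing; by default the real
    mindspore.dataset.vision.read_video is used.
    """
    if reader is None:
        # Imported lazily so the helper can be defined without MindSpore.
        import mindspore.dataset.vision as vision
        reader = vision.read_video
    # Forward the bounds unchanged and mark them as seconds.
    return reader(path, start_pts=start_sec, end_pts=end_sec, pts_unit="sec")
```

A call such as read_clip_seconds("/path/to/file", 0.5, 2.5) would then return the same three values as read_video, limited to the requested range.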