mindearth.cell.ViTKNO

class mindearth.cell.ViTKNO(image_size=(128, 256), patch_size=8, in_channels=1, out_channels=1, encoder_embed_dims=768, encoder_depths=16, mlp_ratio=4, dropout_rate=1.0, drop_path_rate=0.0, num_blocks=16, settings='MLP', high_freq=True, encoder_network=False, compute_dtype=mstype.float32)

ViT-KNO is a deep learning model based on Koopman theory and the Vision Transformer architecture. It builds on the Koopman neural operator, which maps the original nonlinear dynamical system to a linear dynamical system and performs the time evolution in the linear domain. For details, see KoopmanLab: machine learning for solving complex physics equations.
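The Koopman idea described above can be illustrated on a toy system. The sketch below is not part of ViTKNO or MindEarth; the system, the coefficients `a` and `b`, and the helper names `nonlinear_step` and `lift` are assumptions chosen for illustration. It uses a textbook nonlinear map whose dynamics become exactly linear once the state is lifted to the observables (x1, x2, x1^2). In ViT-KNO the lifting is learned rather than written by hand.

```python
import numpy as np

a, b = 0.9, 0.5  # hypothetical coefficients of the toy system


def nonlinear_step(x):
    """One step of the nonlinear map (x1, x2) -> (a*x1, b*x2 + (a**2 - b)*x1**2)."""
    x1, x2 = x
    return np.array([a * x1, b * x2 + (a**2 - b) * x1**2])


def lift(x):
    """Observables (x1, x2, x1**2) in which the dynamics are linear."""
    x1, x2 = x
    return np.array([x1, x2, x1**2])


# Linear (Koopman) operator acting on the lifted state.
K = np.array([
    [a,   0.0, 0.0],
    [0.0, b,   a**2 - b],
    [0.0, 0.0, a**2],
])

x0 = np.array([1.0, -0.5])
steps = 10

# Reference: iterate the nonlinear map directly.
x_ref = x0.copy()
for _ in range(steps):
    x_ref = nonlinear_step(x_ref)

# Koopman route: lift once, advance linearly, read back the first two observables.
z = np.linalg.matrix_power(K, steps) @ lift(x0)
x_koopman = z[:2]

print(np.allclose(x_ref, x_koopman))
```

Both routes produce the same state, which is the property the model relies on: once a suitable lifted representation is available, time stepping reduces to repeated application of a linear operator.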

Parameters
  • image_size (tuple[int], optional) – The size of the input image. Default: (128, 256).

  • patch_size (int, optional) – The size of each image patch. Default: 8.

  • in_channels (int, optional) – The number of channels in the input space. Default: 1.

  • out_channels (int, optional) – The number of channels in the output space. Default: 1.

  • encoder_depths (int, optional) – The depth of the encoder. Default: 16.

  • encoder_embed_dims (int, optional) – The embedding dimension of the encoder layers. Default: 768.

  • mlp_ratio (int, optional) – The ratio of the MLP layer. Default: 4.

  • dropout_rate (float, optional) – The rate of the dropout layer. Default: 1.0.

  • drop_path_rate (float, optional) – The rate of the drop path layer. Default: 0.0.

  • num_blocks (int, optional) – The number of blocks. Default: 16.

  • settings (str, optional) – The construction of the first decoder layer. Default: 'MLP'.

  • high_freq (bool, optional) – Whether the high-frequency information complement is applied. Default: True.

  • encoder_network (bool, optional) – Whether the encoder network is applied. Default: False.

  • compute_dtype (dtype, optional) – The data type for encoder, decoding_embedding, decoder and dense layer. Default: mindspore.float32.

Inputs:
  • x (Tensor) - Tensor of shape \((batch\_size, feature\_size, image\_height, image\_width)\).

Outputs:
  • output (Tensor) - Tensor of shape \((batch\_size, num\_patches, embed\_dim)\), where \(num\_patches = (image\_height * image\_width) / (patch\_size * patch\_size)\).
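
As a quick check of the patch-count formula in the output shape above, the following sketch evaluates it for the default `image_size` and `patch_size`. The variable name `num_patches` is a naming choice made here, and the divisibility check is an assumption about how the formula is meant to be used, not a documented requirement of the class.

```python
# Values taken from the class defaults listed above.
image_height, image_width = 128, 256
patch_size = 8

# Assumption: both image dimensions are divisible by the patch size.
assert image_height % patch_size == 0 and image_width % patch_size == 0

num_patches = (image_height * image_width) // (patch_size * patch_size)
print(num_patches)  # 512
```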

Supported Platforms:

Ascend GPU

Examples

>>> from mindspore import dtype
>>> from mindspore.common.initializer import initializer, Normal
>>> from mindearth.cell import ViTKNO
>>> B, C, H, W = 16, 20, 128, 256
>>> input_ = initializer(Normal(), [B, C, H, W])
>>> net = ViTKNO(image_size=(H, W), in_channels=C, out_channels=C, compute_dtype=dtype.float32)
>>> output, _ = net(input_)
>>> print(output.shape)
(16, 128, 5120)