Tensor

A tensor is a multilinear function that can be used to represent linear relationships between vectors, scalars, and other tensors. Basic examples of these linear relations are the inner product, the outer product, the linear map, and the Cartesian product. In an $n$-dimensional space, a tensor of rank $r$ has $n^{r}$ components. Each component is a function of the coordinates, and the components transform linearly according to certain rules when the coordinates are transformed. $r$ is called the rank or order of the tensor (unrelated to the rank or order of a matrix).
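As a small illustration of the $n^{r}$ count, the sketch below enumerates every index tuple of a rank-$r$ tensor in $n$-dimensional space using only the Python standard library. The helper name component_indices is hypothetical and is not part of MindSpore.

```python
from itertools import product

def component_indices(n, r):
    """List every index tuple of a rank-r tensor in n-dimensional space."""
    # Each of the r indices ranges independently over the n coordinate axes.
    return list(product(range(n), repeat=r))

# A rank-2 tensor in 3-dimensional space has 3**2 = 9 components.
indices = component_indices(3, 2)
print(len(indices))  # 9
```

A rank-0 tensor (a scalar) yields a single empty index tuple, which matches $n^{0} = 1$.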

A tensor is a special data structure that is similar to arrays and matrices. Tensor is the basic data structure in MindSpore network operations. This tutorial describes the attributes and usage of tensors and sparse tensors.

```
import numpy as np
import mindspore
from mindspore import ops
from mindspore import Tensor, CSRTensor, COOTensor
```

Creating a Tensor

There are multiple methods for creating tensors. When building a tensor, you can pass the Tensor, float, int, bool, tuple, list, and numpy.ndarray types.

Generating a tensor based on data

You can create a tensor based on data. The data type can be set or automatically inferred by the framework.
```
data = [1, 0, 1, 0]
x_data = Tensor(data)
```
Generating a tensor from a NumPy array

You can create a tensor from a NumPy array.
```
np_array = np.array(data)
x_np = Tensor(np_array)
```

Generating a tensor by using init

When init is used to initialize a tensor, the init, shape, and dtype parameters can be passed.

init: accepts a subclass of initializer.
shape: accepts a list, tuple, or int.
dtype: accepts a mindspore.dtype.

```
from mindspore.common.initializer import One, Normal

# Initialize a tensor with ones
tensor1 = mindspore.Tensor(shape=(2, 2), dtype=mindspore.float32, init=One())
# Initialize a tensor from normal distribution
tensor2 = mindspore.Tensor(shape=(2, 2), dtype=mindspore.float32, init=Normal())

print("tensor1:\n", tensor1)
print("tensor2:\n", tensor2)
```

```
tensor1:
 [[1. 1.]
 [1. 1.]]
tensor2:
 [[-0.00063482 -0.00916224]
 [ 0.01324238 -0.0171206 ]]
```

init is intended for delayed initialization in parallel mode. In other cases, using the init interface to initialize parameters is usually not recommended.

Inheriting attributes of another tensor to form a new tensor

```
from mindspore import ops

x_ones = ops.ones_like(x_data)
print(f"Ones Tensor: \n {x_ones} \n")

x_zeros = ops.zeros_like(x_data)
print(f"Zeros Tensor: \n {x_zeros} \n")
```

```
Ones Tensor:
 [1 1 1 1]

Zeros Tensor:
 [0 0 0 0]
```

Tensor Attributes

Tensor attributes include the shape, data type, transposed tensor, item size, number of bytes occupied, number of dimensions, number of elements, and stride per dimension.

shape: the shape of Tensor, a tuple.
dtype: the dtype of Tensor, a data type of MindSpore.
itemsize: the number of bytes occupied by each element in Tensor, which is an integer.
nbytes: the total number of bytes occupied by Tensor, which is an integer.
ndim: rank of Tensor, that is, len(tensor.shape), which is an integer.
size: the number of all elements in Tensor, which is an integer.
strides: the number of bytes to traverse in each dimension of Tensor, which is a tuple.

```
x = Tensor(np.array([[1, 2], [3, 4]]), mindspore.int32)

print("x_shape:", x.shape)
print("x_dtype:", x.dtype)
print("x_itemsize:", x.itemsize)
print("x_nbytes:", x.nbytes)
print("x_ndim:", x.ndim)
print("x_size:", x.size)
print("x_strides:", x.strides)
```

```
x_shape: (2, 2)
x_dtype: Int32
x_itemsize: 4
x_nbytes: 16
x_ndim: 2
x_size: 4
x_strides: (8, 4)
```
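
The size, nbytes, and strides values printed above follow from the shape and the item size. The following sketch recomputes them in plain Python for the shape (2, 2) and the 4-byte item size of int32. It assumes a contiguous row-major (C-order) layout, and row_major_strides is a hypothetical helper, not a MindSpore API.

```python
from math import prod

def row_major_strides(shape, itemsize):
    """Byte strides of a contiguous row-major array (assumed layout)."""
    strides = []
    step = itemsize  # stepping along the last dimension moves one element
    for dim in reversed(shape):
        strides.append(step)
        step *= dim  # each outer dimension skips a whole inner block
    return tuple(reversed(strides))

shape, itemsize = (2, 2), 4          # int32 elements occupy 4 bytes
size = prod(shape)                   # number of elements: 4
nbytes = size * itemsize             # total bytes: 16
strides = row_major_strides(shape, itemsize)
print(size, nbytes, strides)         # 4 16 (8, 4)
```

Moving one step along the first dimension skips a full row of two 4-byte elements, which is why the first stride is 8.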

Tensor Indexing

Tensor indexing is similar to NumPy indexing. Indexing starts from 0, a negative index counts from the end, and the colon (:) and ellipsis (...) are used for slicing.

```
tensor = Tensor(np.array([[0, 1], [2, 3]]).astype(np.float32))

print("First row: {}".format(tensor[0]))
print("value of bottom right corner: {}".format(tensor[1, 1]))
print("Last column: {}".format(tensor[:, -1]))
print("First column: {}".format(tensor[..., 0]))
```

```
First row: [0. 1.]
value of bottom right corner: 3.0
Last column: [1. 3.]
First column: [0. 2.]
```

Tensor Operation

There are many operations between tensors, including arithmetic, linear algebra, matrix processing (transposing, indexing, and slicing), and sampling. The usage of tensor operation is similar to that of NumPy. The following describes several operations.

Common arithmetic operations include addition (+), subtraction (-), multiplication (*), division (/), modulo (%), and floor division (//).

```
x = Tensor(np.array([1, 2, 3]), mindspore.float32)
y = Tensor(np.array([4, 5, 6]), mindspore.float32)

output_add = x + y
output_sub = x - y
output_mul = x * y
output_div = y / x
output_mod = y % x
output_floordiv = y // x

print("add:", output_add)
print("sub:", output_sub)
print("mul:", output_mul)
print("div:", output_div)
print("mod:", output_mod)
print("floordiv:", output_floordiv)
```

```
add: [5. 7. 9.]
sub: [-3. -3. -3.]
mul: [ 4. 10. 18.]
div: [4.  2.5 2. ]
mod: [0. 1. 0.]
floordiv: [4. 2. 2.]
```

Concat joins a series of tensors along a given dimension.

```
data1 = Tensor(np.array([[0, 1], [2, 3]]).astype(np.float32))
data2 = Tensor(np.array([[4, 5], [6, 7]]).astype(np.float32))
output = ops.concat((data1, data2), axis=0)

print(output)
print("shape:\n", output.shape)
```

```
[[0. 1.]
 [2. 3.]
 [4. 5.]
 [6. 7.]]
shape:
 (4, 2)
```

Stack combines tensors along a new dimension.

```
data1 = Tensor(np.array([[0, 1], [2, 3]]).astype(np.float32))
data2 = Tensor(np.array([[4, 5], [6, 7]]).astype(np.float32))
output = ops.stack([data1, data2])

print(output)
print("shape:\n", output.shape)
```

```
[[[0. 1.]
  [2. 3.]]

 [[4. 5.]
  [6. 7.]]]
shape:
 (2, 2, 2)
```

Conversion Between Tensor and NumPy

Tensors and NumPy arrays can be converted to each other.

Tensor to NumPy

Use asnumpy() to convert a Tensor to a NumPy array.

```
t = ops.ones(5, mindspore.float32)
print(f"t: {t}")
n = t.asnumpy()
print(f"n: {n}")
```

```
t: [1. 1. 1. 1. 1.]
n: [1. 1. 1. 1. 1.]
```

NumPy to Tensor

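Use Tensor.from_numpy() to convert a NumPy array to a Tensor. In the example that follows, the in-place change to n also shows up in t, which indicates that the two share the same underlying data.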

```
n = np.ones(5)
t = Tensor.from_numpy(n)

np.add(n, 1, out=n)
print(f"t: {t}")
print(f"n: {n}")
```

```
t: [2. 2. 2. 2. 2.]
n: [2. 2. 2. 2. 2.]
```

Sparse Tensor

A sparse tensor is a special tensor in which most of the elements are zero.

In some scenarios (such as recommendation systems, molecular dynamics, and graph neural networks), the data is sparse. Representing such data with ordinary dense tensors can introduce a lot of unnecessary computation, storage, and communication cost. In these cases, it is better to use a sparse tensor to represent the data.

MindSpore now supports the two most commonly used sparse data formats, CSR and COO.

The common structure of a sparse tensor is <indices:Tensor, values:Tensor, shape:Tensor>, where indices holds the indexes of the non-zero elements, values holds the values of the non-zero elements, and shape is the dense shape of the sparse tensor. Based on this structure, MindSpore defines the data structures CSRTensor, COOTensor, and RowTensor.

CSRTensor

The compressed sparse row (CSR) format is efficient in both storage and computation. All the non-zero values are stored in values, and their positions are stored in indptr (row) and indices (column). The meaning of each parameter is as follows:

indptr: 1-D integer tensor, indicating where the non-zero elements of each row start and end in values. The index data type can be int16, int32, or int64.
indices: 1-D integer tensor, indicating the column position of each non-zero element of the sparse tensor. It has the same length as values. The index data type can be int16, int32, or int64.
values: 1-D tensor, holding the values of the non-zero elements of the CSRTensor. It has the same length as indices.
shape: the shape of the compressed sparse tensor, given as a tuple. Currently, only 2-D CSRTensor is supported.

For details about CSRTensor, see mindspore.CSRTensor.

The following are some examples of using the CSRTensor:

```
indptr = Tensor([0, 1, 2])
indices = Tensor([0, 1])
values = Tensor([1, 2], dtype=mindspore.float32)
shape = (2, 4)

# Make a CSRTensor
csr_tensor = CSRTensor(indptr, indices, values, shape)

print(csr_tensor.astype(mindspore.float64).dtype)
```

```
Float64
```

The above code generates a CSRTensor as shown in the following equation:

$$\begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 2 & 0 & 0 \end{bmatrix}$$
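
To make the roles of indptr, indices, and values concrete, the sketch below expands the same CSR data into a dense nested list in plain Python. The function csr_to_dense is a hypothetical helper written for illustration only. It is not a MindSpore API.

```python
def csr_to_dense(indptr, indices, values, shape):
    """Expand 2-D CSR data into a dense list of rows (illustrative helper)."""
    rows, cols = shape
    dense = [[0] * cols for _ in range(rows)]
    for row in range(rows):
        # indptr[row]:indptr[row + 1] is the slice of values belonging to this row.
        for k in range(indptr[row], indptr[row + 1]):
            dense[row][indices[k]] = values[k]
    return dense

dense = csr_to_dense(indptr=[0, 1, 2], indices=[0, 1], values=[1, 2], shape=(2, 4))
for row in dense:
    print(row)
# [1, 0, 0, 0]
# [0, 2, 0, 0]
```

Here indptr = [0, 1, 2] says that row 0 owns values[0:1] and row 1 owns values[1:2], while indices gives the column of each of those values.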

COOTensor

The coordinate (COO) format represents a sparse tensor as a collection of non-zero elements together with their indexes. Let N be the number of non-zero elements and ndims be the number of dimensions of the compressed tensor. The meaning of each parameter is as follows:

indices: 2-D integer tensor. Each row is the index of one non-zero element. Shape: [N, ndims]. The index data type can be int16, int32, or int64.
values: 1-D tensor of any type, holding the values of the non-zero elements. Shape: [N].
shape: the shape of the compressed sparse tensor. Currently, only 2-D COOTensor is supported.

For details about COOTensor, see mindspore.COOTensor.

The following are some examples of using COOTensor:

```
indices = Tensor([[0, 1], [1, 2]], dtype=mindspore.int32)
values = Tensor([1, 2], dtype=mindspore.float32)
shape = (3, 4)

# Make a COOTensor
coo_tensor = COOTensor(indices, values, shape)

print(coo_tensor.values)
print(coo_tensor.indices)
print(coo_tensor.shape)
print(coo_tensor.astype(mindspore.float64).dtype)  # COOTensor to float64
```

```
[1. 2.]
[[0 1]
 [1 2]]
(3, 4)
Float64
```

The preceding code generates COOTensor as follows:

$$\begin{bmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & 2 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}$$
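
The mapping from COO data to the dense matrix can be sketched in plain Python as follows. The function coo_to_dense is a hypothetical helper for illustration and is not part of MindSpore. It uses the same indices, values, and shape as the COOTensor example.

```python
def coo_to_dense(indices, values, shape):
    """Expand 2-D COO data into a dense list of rows (illustrative helper)."""
    rows, cols = shape
    dense = [[0] * cols for _ in range(rows)]
    # Each (row, col) pair in indices locates the value at the same position in values.
    for (row, col), value in zip(indices, values):
        dense[row][col] = value
    return dense

dense = coo_to_dense(indices=[[0, 1], [1, 2]], values=[1, 2], shape=(3, 4))
for row in dense:
    print(row)
# [0, 1, 0, 0]
# [0, 0, 2, 0]
# [0, 0, 0, 0]
```

Unlike CSR, each non-zero element carries its full coordinate, so no row pointer array is needed.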