mindsponge.common.make_atom14_positions

mindsponge.common.make_atom14_positions(aatype, all_atom_mask, all_atom_positions)

Converts atom coordinates from the sparse encoding to the dense encoding.

Atom coordinates in a protein can be encoded in two forms.

  • Sparse encoding. The 20 amino acids contain 37 atom types in total, as listed in common.residue_constants.atom_types, so the atom coordinates of a protein can be encoded as a Tensor with shape \((N_{res}, 37, 3)\).

  • Dense encoding. Each of the 20 amino acids contains at most 14 atoms, as listed in common.residue_constants.restype_name_to_atom14_names, so the atom coordinates of a protein can be encoded as a Tensor with shape \((N_{res}, 14, 3)\).
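The relationship between the two encodings can be sketched with a toy NumPy example. This is only an illustration under stated assumptions, not the library's implementation: `TOY_ATOM_TYPES`, `TOY_DENSE_NAMES`, and `toy_sparse_to_dense` are hypothetical names, and the tables are cut down to 7 sparse atom types and 6 dense slots instead of the real 37 and 14.

```python
import numpy as np

# Hypothetical, shortened stand-in for the 37-entry sparse atom-type table.
TOY_ATOM_TYPES = ["N", "CA", "C", "CB", "O", "OG", "SG"]

# Hypothetical, shortened stand-in for the per-residue dense slot names.
# An empty string marks a slot that the residue does not use.
TOY_DENSE_NAMES = {
    "GLY": ["N", "CA", "C", "O", "", ""],
    "ALA": ["N", "CA", "C", "O", "CB", ""],
    "SER": ["N", "CA", "C", "O", "CB", "OG"],
}


def toy_sparse_to_dense(res_names, sparse_pos, sparse_mask):
    """Gather per-residue dense slots out of the sparse layout."""
    order = {name: i for i, name in enumerate(TOY_ATOM_TYPES)}
    # idx[r, k] is the sparse column that feeds dense slot k of residue r
    # (empty slots point at column 0 and are masked out below).
    idx = np.array([[order.get(n, 0) for n in TOY_DENSE_NAMES[r]]
                    for r in res_names])
    slot_used = np.array([[n != "" for n in TOY_DENSE_NAMES[r]]
                          for r in res_names], dtype=np.float32)
    # A dense atom exists if its slot is used and the sparse atom is present.
    dense_mask = slot_used * np.take_along_axis(sparse_mask, idx, axis=1)
    dense_pos = np.take_along_axis(sparse_pos, idx[..., None], axis=1)
    dense_pos = dense_pos * dense_mask[..., None]
    return dense_pos, dense_mask, idx


res_names = ["GLY", "SER"]
sparse_pos = np.arange(2 * 7 * 3, dtype=np.float32).reshape(2, 7, 3)
sparse_mask = np.zeros((2, 7), dtype=np.float32)
sparse_mask[0, [0, 1, 2, 4]] = 1.0        # GLY: N, CA, C, O
sparse_mask[1, [0, 1, 2, 3, 4, 5]] = 1.0  # SER: N, CA, C, CB, O, OG

dense_pos, dense_mask, idx = toy_sparse_to_dense(res_names, sparse_pos, sparse_mask)
print(dense_pos.shape, dense_mask.shape)
```

The sparse layout gives every atom type a fixed column for all residues, while the dense layout reuses a small number of slots whose meaning depends on the residue type, which is why an index array is needed to translate between them.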

Parameters
  • aatype (numpy.ndarray) – Protein sequence encoding. The encoding follows common.residue_constants.restype_order. Value range is \([0,20]\), where 20 means the amino acid is unknown (UNK).

  • all_atom_mask (numpy.ndarray) – Mask of the coordinates of all atoms in the protein. Shape is \((N_{res}, 37)\). A value of 0 at a position means the amino acid does not contain that atom.

  • all_atom_positions (numpy.ndarray) – Coordinates of all atoms in the protein. Shape is \((N_{res}, 37, 3)\).

Returns

  • numpy.ndarray. Dense encoding, mask of all atoms in the protein, including atoms of unknown amino acids. Shape is \((N_{res}, 14)\).

  • numpy.ndarray. Dense encoding, mask of all atoms in the protein, excluding atoms of unknown amino acids. Shape is \((N_{res}, 14)\).

  • numpy.ndarray. Dense encoding, coordinates of all atoms in the protein. Shape is \((N_{res}, 14, 3)\).

  • numpy.ndarray. Index that maps each atom of the dense encoding to its position in the sparse encoding. Shape is \((N_{res}, 14)\).

  • numpy.ndarray. Index that maps each atom of the sparse encoding to its position in the dense encoding. Shape is \((N_{res}, 37)\).

  • numpy.ndarray. Sparse encoding, mask of all atoms in the protein, including atoms of unknown amino acids. Shape is \((N_{res}, 37)\).

  • numpy.ndarray. Dense-encoding atom coordinates after the chiral transformation. Shape is \((N_{res}, 14, 3)\).

  • numpy.ndarray. Atom mask after the chiral transformation. Shape is \((N_{res}, 14)\).

  • numpy.ndarray. Flag marking atoms affected by the chiral transformation: 1 means transformed and 0 means not transformed. Shape is \((N_{res}, 14)\).
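Because the nine arrays come back positionally, it can be convenient to attach labels and check their shapes against the list above. The following is a minimal sketch under assumptions: the label strings and the `label_outputs` helper are hypothetical and not part of the documented API, and zero-filled arrays stand in for a real result.

```python
import numpy as np

# Hypothetical labels paired with the documented trailing shape of each output,
# in the order the outputs are listed above.
RETURN_LAYOUT = [
    ("atom14_atom_exists", (14,)),
    ("atom14_gt_exists", (14,)),
    ("atom14_gt_positions", (14, 3)),
    ("residx_atom14_to_atom37", (14,)),
    ("residx_atom37_to_atom14", (37,)),
    ("atom37_atom_exists", (37,)),
    ("atom14_alt_gt_positions", (14, 3)),
    ("atom14_alt_gt_exists", (14,)),
    ("atom14_atom_is_ambiguous", (14,)),
]


def label_outputs(result):
    """Return a dict of labeled arrays after checking the documented shapes."""
    if len(result) != len(RETURN_LAYOUT):
        raise ValueError(f"expected {len(RETURN_LAYOUT)} arrays, got {len(result)}")
    n_res = result[0].shape[0]
    labeled = {}
    for (name, tail), arr in zip(RETURN_LAYOUT, result):
        expected = (n_res,) + tail
        if arr.shape != expected:
            raise ValueError(f"{name}: expected shape {expected}, got {arr.shape}")
        labeled[name] = arr
    return labeled


# Zero-filled stand-in for the tuple returned for a 5-residue protein.
stand_in = tuple(np.zeros((5,) + tail, dtype=np.float32) for _, tail in RETURN_LAYOUT)
labeled = label_outputs(stand_in)
print(labeled["atom14_gt_positions"].shape)
```

With a real result, the same helper would be called on the value returned by make_atom14_positions.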

Symbol:
  • \(N_{res}\) - The number of amino acid residues in the protein sequence.

Supported Platforms:

Ascend GPU

Examples

>>> from mindsponge.common import make_atom14_positions
>>> from mindsponge.common import protein
>>> import numpy as np
>>> pdb_path = "YOUR_PDB_FILE"
>>> with open(pdb_path, 'r', encoding='UTF-8') as f:
...     prot_pdb = protein.from_pdb_string(f.read())
>>> result = make_atom14_positions(prot_pdb.aatype, prot_pdb.atom_mask.astype(np.float32),
...                                prot_pdb.atom_positions.astype(np.float32))
>>> for val in result:
...     print(val.shape)
(Nres, 14)
(Nres, 14)
(Nres, 14, 3)
(Nres, 14)
(Nres, 37)
(Nres, 37)
(Nres, 14, 3)
(Nres, 14)
(Nres, 14)