Structure

The molecule submodule provides models that describe molecular topology and conformation.

Topology

To be completed soon.

Molecule

Description

The Molecule model describes particle systems such as atomistic, coarse-grained, and symbolic (e.g. SMILES) molecules. A model is created by passing keyword arguments to the constructor, as shown below:

>>> import mmelemental
>>> mol = mmelemental.models.Molecule(
...     **{
...         "symbols": ["O", "H", "H"],
...         "geometry": [2.0, 2.09, 0.0, 2.82, 2.09, 0.58, 1.18, 2.09, 0.58],
...         "connectivity": [(0, 1, 1.0), (0, 2, 1.0)],
...     }
... )
>>> mol
Molecule(name='H2O', hash='704898f')

In addition, Molecule provides from_data() and from_file() methods to create a molecule from data (e.g. MDAnalysis.Universe) or file objects. The methods to_data() and to_file() enable converting a molecule to data and file objects, respectively. See the API for more details. See the Structure tutorials for more in-depth examples.

API

class mmelemental.models.struct.molecule.Molecule(**kwargs)
A representation of a Molecule in MM. This model contains data for symbols, geometry, connectivity, charges, residues, etc. while also supporting a wide array of I/O and manipulation capabilities. Charges, masses, geometry, and velocities are truncated to 4, 6, 8, and 8 decimal places, respectively, to assist with duplicate detection.
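The fixed precisions above help duplicate detection because two molecules that differ only by floating-point noise reduce to identical values before they are compared. The sketch below illustrates the idea only: the `normalized_hash` helper, the use of SHA-1 over JSON, and the use of rounding as a stand-in for truncation are all assumptions made for illustration, not MMElemental's actual hashing scheme.

```python
import hashlib
import json

# Decimal places quoted in the class description. Everything else in this
# sketch (helper name, JSON + SHA-1, rounding) is an illustrative assumption.
DECIMALS = {"charges": 4, "masses": 6, "geometry": 8, "velocities": 8}


def normalized_hash(fields):
    """Reduce each field to its fixed precision, then hash the result."""
    reduced = {
        # Adding 0.0 turns a possible -0.0 into 0.0 so both serialize alike.
        name: [round(float(v), DECIMALS[name]) + 0.0 for v in values]
        for name, values in fields.items()
    }
    payload = json.dumps(reduced, sort_keys=True).encode()
    return hashlib.sha1(payload).hexdigest()[:7]


a = normalized_hash({"geometry": [2.0, 2.09, 0.0], "masses": [15.999]})
b = normalized_hash({"geometry": [2.0, 2.09 + 1e-12, 0.0], "masses": [15.999]})
c = normalized_hash({"geometry": [2.0, 2.10, 0.0], "masses": [15.999]})

print(a == b)  # noise below the 8th decimal place is discarded
print(a == c)  # a real coordinate change yields a different hash
```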

Parameters
  • schema_name (ConstrainedStrValue, Default: mmschema_molecule) – The MMSchema specification to which this model conforms. Explicitly fixed as mmschema_molecule.

  • schema_version (int, Default: 1) – The version number of schema_name to which this model conforms.

  • symbols (Array, Optional) – An ordered (natom,) array-like object of particle symbols. The index of this attribute sets the order for all other per-particle settings, such as geometry.

  • name (str, Optional) – Common or human-readable name to assign to this molecule. This field can be arbitrary.

  • identifiers (Identifiers, Optional) – An optional dictionary of additional identifiers by which this molecule can be referenced, such as INCHI, canonical SMILES, etc. See the Identifiers model for more details.

  • comment (str, Optional) – Additional comments for this molecule. Intended for pure human/user consumption and clarity.

  • ndim (int, Default: 3) – Number of spatial dimensions.

  • atom_labels (Array, Optional) – Additional per-atom labels as an array of strings. Typically used in model conversions, such as Elemental <-> Molpro, and not usually assigned by the user. See the comment field for general human-consumable text to affix to the molecule.

  • atomic_numbers (Array, Optional) – An optional ordered 1-D array-like object of atomic numbers of shape (nat,). Index matches the 0-indexed indices of all other per-atom settings like symbols and real. Values are inferred from the symbols list if not explicitly set. Ghostedness should be indicated through the real field, not zeros here.

  • mass_numbers (Array, Optional) – An optional ordered 1-D array-like object of atomic mass numbers of shape (nat,). Index matches the 0-indexed indices of all other per-atom settings like symbols and real. Values are inferred from the most common isotopes of the symbols list if not explicitly set. If a single isotope is not (yet) known for an atom, -1 is used as a placeholder.

  • masses (Array, Optional) – The ordered array of particle masses. Index order matches the 0-indexed indices of all other per-atom fields like symbols and real. If this is not provided, the mass of each atom is inferred from its most common isotope. If this is provided, it must be the same length as symbols but can accept None entries for standard masses to infer from the same index in the symbols field.

  • masses_units (str, Default: unified_atomic_mass_unit) – Units for atomic masses. Defaults to unified atomic mass unit.

  • molecular_charge (float, Default: 0.0) – The net electrostatic charge of the molecule. Default unit is elementary charge.

  • molecular_charge_units (str, Default: elementary_charge) – Units for molecular charge. Defaults to elementary charge.

  • formal_charges (Array, Optional) – Formal charges of all particles/atoms.

  • formal_charges_units (str, Default: elementary_charge) – Units for formal charges. Defaults to elementary charge.

  • partial_charges (Array, Optional) – Assigned partial charges of all particles/atoms.

  • partial_charges_units (str, Default: elementary_charge) – Units for partial charges. Defaults to elementary charge.

  • geometry (Array, Optional) – An ordered (natom*ndim,) array for XYZ atomic coordinates. Default unit is Angstrom.

  • geometry_units (str, Default: angstrom) – Units for atomic geometry. Defaults to Angstroms.

  • velocities (Array, Optional) – An ordered (natom*ndim,) array for XYZ atomic velocities. Default unit is angstrom/femtosecond.

  • velocities_units (str, Default: angstrom / femtosecond) – Units for atomic velocities. Defaults to Angstroms/femtoseconds.

  • connectivity (Array, Optional) – A list of bonds within the molecule. Each entry is a tuple of (atom_index_A, atom_index_B, bond_order) where the atom_index matches the 0-indexed indices of all other per-atom settings like symbols and real. Bonds may be freely reordered and inverted.

  • substructs (Array, Optional) – A list of (name, num) tuples, one per atom, identifying the building block (e.g. a monomer) of the structure (e.g. a polymer) that the atom belongs to. Order follows atomic indices from 0 to natom-1. E.g. [(‘ALA’, 4), …] means the first atom belongs to the amino acid alanine with residue number 4. A substruct name is at most 4 characters long.

  • provenance (Provenance, Optional) – The provenance information about how this object (and its attributes) were generated, provided, and manipulated.

  • extras (Dict[Any], Optional) – Additional information to bundle with the molecule. Use for schema development and scratch space.

  • hash (str, Optional) – The hash code that uniquely identifies this object. Typically not manually assigned but left for MMElemental to handle.
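As a rough illustration of the geometry and connectivity layouts described in the parameter list, the sketch below uses plain Python and the water example from the Description section. It splits the flat (natom*ndim,) geometry into per-atom coordinates and puts bonds in a canonical order so that reordered or inverted bond lists compare equal. The `canonical_bonds` helper is hypothetical and not part of MMElemental.

```python
symbols = ["O", "H", "H"]
geometry = [2.0, 2.09, 0.0, 2.82, 2.09, 0.58, 1.18, 2.09, 0.58]
connectivity = [(0, 1, 1.0), (0, 2, 1.0)]
ndim = 3

# Split the flat (natom*ndim,) array into one coordinate tuple per atom.
coords = [tuple(geometry[i:i + ndim]) for i in range(0, len(geometry), ndim)]
assert len(coords) == len(symbols)


def canonical_bonds(bonds):
    """Hypothetical helper: order each bond's indices, then sort the list."""
    return sorted((min(a, b), max(a, b), order) for a, b, order in bonds)


# The same two O-H bonds, listed in a different order and with indices swapped.
shuffled = [(2, 0, 1.0), (1, 0, 1.0)]
same = canonical_bonds(shuffled) == canonical_bonds(connectivity)

for symbol, xyz in zip(symbols, coords):
    print(symbol, xyz)
print(same)
```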

Return type

None
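The masses parameter accepts None entries that are filled in from the symbol at the same index. A minimal sketch of that behaviour is shown below, assuming a small hand-written lookup table; the `ISOTOPE_MASS` table and the `fill_masses` helper are hypothetical and stand in for whatever data source MMElemental consults.

```python
# Illustrative most-common-isotope masses in unified atomic mass units.
# This table is an assumption for the sketch, not MMElemental's data.
ISOTOPE_MASS = {"H": 1.00782503, "C": 12.0, "N": 14.00307400, "O": 15.99491462}


def fill_masses(symbols, masses=None):
    """Hypothetical helper: replace missing masses with the table value."""
    if masses is None:
        masses = [None] * len(symbols)
    if len(masses) != len(symbols):
        raise ValueError("masses must be the same length as symbols")
    return [
        ISOTOPE_MASS[symbol] if mass is None else float(mass)
        for symbol, mass in zip(symbols, masses)
    ]


# Water with one hydrogen replaced by deuterium; the other two entries are inferred.
filled = fill_masses(["O", "H", "H"], [None, 2.01410178, None])
print(filled)
```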

classmethod from_data(data=None, dtype=None, **kwargs)

Constructs a Molecule object from a data object.

Parameters
  • data (Any, optional) – Data to construct Molecule from such as a data object (e.g. MDAnalysis.Universe) or dict.

  • dtype (str, optional) – How to interpret the data. If not passed, MMElemental attempts to discover this based on the input type.

  • **kwargs (Optional[Dict[str, Any]], optional) – Additional kwargs to pass to the constructors.

Returns

A Molecule object.

Return type

Molecule

Examples

>>> mol = mmelemental.models.Molecule.from_data(universe, dtype="mdanalysis")
>>> mol = mmelemental.models.Molecule.from_data(struct, dtype="parmed")

classmethod from_file(filename, top_filename=None, dtype=None, *, translator=None, **kwargs)

Constructs a Molecule object from a file.

Parameters
  • filename (str) – The molecular structure filename to read.

  • top_filename (str, optional) – The topology (i.e. connectivity) filename to read.

  • dtype (str, optional) – The type of file to interpret. If not set, mmelemental attempts to discover the file type.

  • translator (Optional[str], optional) – Translator name e.g. mmic_rdkit. Takes precedence over dtype. If unset, MMElemental attempts to find an appropriate translator if it is registered in the TransComponent class.

  • **kwargs (Optional[Dict[str, Any]], optional) – Any additional keywords to pass to the constructor

Returns

A Molecule object.

Return type

Molecule

get_hash()

Returns the hash of the molecule.

get_molecular_formula(order='alphabetical')

Returns the molecular formula for a molecule.

Parameters

order (str, optional) – Sorting order of the formula. Valid choices are “alphabetical” and “hill”.

Returns

The molecular formula.

Return type

str
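To make the two sorting orders concrete, the sketch below builds a formula from a list of symbols. In the Hill system, carbon comes first and hydrogen second when carbon is present, with the remaining elements in alphabetical order; without carbon, all elements are alphabetical. The `molecular_formula` function here is a standalone illustration, and details such as omitting a count of 1 are assumptions rather than confirmed behaviour of get_molecular_formula().

```python
from collections import Counter


def molecular_formula(symbols, order="alphabetical"):
    """Standalone sketch of alphabetical and Hill ordering."""
    counts = Counter(symbols)
    if order == "alphabetical":
        elements = sorted(counts)
    elif order == "hill":
        if "C" in counts:
            # Carbon first, hydrogen second, the rest alphabetically.
            lead = [el for el in ("C", "H") if el in counts]
            elements = lead + sorted(el for el in counts if el not in ("C", "H"))
        else:
            # Without carbon, Hill order is purely alphabetical.
            elements = sorted(counts)
    else:
        raise ValueError(f"unknown order: {order!r}")
    # Assumption: a count of 1 is left implicit, as in "H2O".
    return "".join(el + (str(n) if (n := counts[el]) > 1 else "") for el in elements)


chloroform = ["C", "H", "Cl", "Cl", "Cl"]
print(molecular_formula(chloroform, order="alphabetical"))
print(molecular_formula(chloroform, order="hill"))
print(molecular_formula(["O", "H", "H"], order="hill"))
```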

get_substructs()

Returns the entries in substructs with duplicates removed, preserving their order.

Return type

List
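Order-preserving de-duplication of this kind can be sketched in plain Python as follows. The per-atom list below is made-up example data, and `dict.fromkeys` is just one common way to do this, not necessarily how get_substructs() is implemented.

```python
# One (name, num) entry per atom; made-up data for illustration.
substructs = [
    ("ALA", 4), ("ALA", 4), ("ALA", 4),
    ("GLY", 5), ("GLY", 5),
    ("ALA", 4),
]

# dict keys keep first-insertion order, so later repeats are dropped.
unique = list(dict.fromkeys(substructs))
print(unique)
```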

show(ngl_kwargs=None)

Creates a 3D representation of a molecule that can be manipulated in Jupyter Notebooks and exported as images (.png).

Parameters

ngl_kwargs (Optional[Dict[str, Any]]) – Additional nglview NGLWidget kwargs.

Returns

An nglview view of the molecule.

Return type

nglview.NGLWidget

to_data(dtype=None, *, translator=None, **kwargs)

Converts Molecule to toolkit-specific molecule (e.g. rdkit, MDAnalysis, parmed).

Parameters
  • dtype (str, optional) – The type of data object to convert to e.g. mdanalysis, rdkit, parmed, etc.

  • translator (Optional[str], optional) – Translator name e.g. mmic_rdkit. Takes precedence over dtype. If unset, MMElemental attempts to find an appropriate translator if it is registered in the TransComponent class.

  • **kwargs (Optional[Dict[str, Any]], optional) – Additional kwargs to pass to the constructor.

Returns

Toolkit-specific molecule model

Return type

ToolkitModel

to_file(filename, dtype=None, *, translator=None, **kwargs)

Writes the Molecule to a file.

Parameters
  • filename (str) – The filename to write to.

  • dtype (Optional[str], optional) – The type of file to write (e.g. json, pdb, etc.). If not provided, MMElemental attempts to infer dtype from the file extension.

  • translator (Optional[str], optional) – Translator name e.g. mmic_rdkit. Takes precedence over dtype. If unset, MMElemental attempts to find an appropriate translator if it is registered in the TransComponent class.

  • **kwargs (Optional[Dict[str, Any]], optional) – Additional kwargs to pass to the constructor.

Return type

None
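The to_file() parameter descriptions imply a resolution order: an explicit translator takes precedence over dtype, and a missing dtype is inferred from the file extension. The sketch below models only that order. The `resolve_format` helper is hypothetical, and the later step in which MMElemental looks up a registered translator for the chosen dtype is not modelled.

```python
from pathlib import Path


def resolve_format(filename, dtype=None, translator=None):
    """Hypothetical helper returning a (kind, value) pair for the chosen route."""
    if translator is not None:
        # An explicit translator takes precedence over dtype.
        return ("translator", translator)
    if dtype is not None:
        return ("dtype", dtype)
    # Fall back to the file extension, e.g. "water.pdb" -> "pdb".
    suffix = Path(filename).suffix.lstrip(".").lower()
    if not suffix:
        raise ValueError(f"cannot infer dtype from {filename!r}")
    return ("dtype", suffix)


print(resolve_format("water.pdb"))
print(resolve_format("water.out", dtype="json"))
print(resolve_format("water.pdb", dtype="pdb", translator="mmic_rdkit"))
```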