Structure
The molecule submodule provides models that describe molecular topology and conformation.
Topology
To be completed soon.
Molecule
Description
The Molecule model describes particle systems such as atomistic, coarse-grained, and symbolic (e.g. SMILES) molecules. A model is created by passing keyword arguments to the constructor, as shown below:
>>> import mmelemental.models
>>> mol = mmelemental.models.Molecule(
...     **{
...         "symbols": ["O", "H", "H"],
...         "geometry": [2.0, 2.09, 0.0, 2.82, 2.09, 0.58, 1.18, 2.09, 0.58],
...         "connectivity": [(0, 1, 1.0), (0, 2, 1.0)],
...     }
... )
>>> mol
Molecule(name='H2O', hash='704898f')
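The flat geometry list in the example above stores coordinates atom by atom. As a minimal sketch in plain Python, independent of MMElemental (the helper name `split_geometry` is hypothetical and not part of the library), the list can be regrouped into one coordinate tuple per symbol, assuming the default of three spatial dimensions:

```python
from typing import List, Sequence, Tuple


def split_geometry(flat: Sequence[float], ndim: int = 3) -> List[Tuple[float, ...]]:
    """Group a flat (natom*ndim,) coordinate list into per-atom tuples."""
    if len(flat) % ndim != 0:
        raise ValueError("geometry length must be a multiple of ndim")
    return [tuple(flat[i:i + ndim]) for i in range(0, len(flat), ndim)]


symbols = ["O", "H", "H"]
geometry = [2.0, 2.09, 0.0, 2.82, 2.09, 0.58, 1.18, 2.09, 0.58]
connectivity = [(0, 1, 1.0), (0, 2, 1.0)]

coords = split_geometry(geometry)

# One coordinate tuple per symbol, in the same order as `symbols`.
for symbol, xyz in zip(symbols, coords):
    print(symbol, xyz)

# Bond entries are (atom_index_A, atom_index_B, bond_order) with 0-based indices.
bonds_ok = all(
    0 <= a < len(symbols) and 0 <= b < len(symbols) for a, b, _ in connectivity
)
print("bond indices valid:", bonds_ok)
```

A check like this can catch a geometry list whose length does not match the number of symbols before the data is handed to the constructor.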
In addition, Molecule provides from_data() and from_file() methods to create a molecule from data (e.g. MDAnalysis.Universe) or file objects. The methods to_data() and to_file() enable converting a molecule to data and file objects, respectively. See the API for more details.
See the Structure tutorials for more in-depth examples.
API
- class mmelemental.models.struct.molecule.Molecule(**kwargs)
- A representation of a Molecule in MM. This model contains data for symbols, geometry, connectivity, charges,
residues, etc. while also supporting a wide array of I/O and manipulation capabilities. Charges, masses, geometry, and velocities are truncated to 4, 6, 8, and 8 decimal places, respectively, to assist with duplicate detection.
- Parameters
schema_name (ConstrainedStrValue, Default: mmschema_molecule) – The MMSchema specification to which this model conforms. Explicitly fixed as mmschema_molecule.
schema_version (int, Default: 1) – The version number of schema_name to which this model conforms.
symbols (Array, Optional) – An ordered (natom,) array-like object of particle symbols. The index of this attribute sets the order for all other per-particle settings, such as geometry.
name (str, Optional) – Common or human-readable name to assign to this molecule. This field can be arbitrary.
identifiers (Identifiers, Optional) – An optional dictionary of additional identifiers by which this molecule can be referenced, such as INCHI, canonical SMILES, etc. See the Identifiers model for more details.
comment (str, Optional) – Additional comments for this molecule. Intended for pure human/user consumption and clarity.
ndim (int, Default: 3) – Number of spatial dimensions.
atom_labels (Array, Optional) – Additional per-atom labels as an array of strings. Typical use is in model conversions, such as Elemental <-> Molpro, and not typically something that should be user assigned. See the comment field for general human-consumable text to affix to the molecule.
atomic_numbers (Array, Optional) – An optional ordered 1-D array-like object of atomic numbers of shape (nat,). Index matches the 0-indexed indices of all other per-atom settings like symbols and real. Values are inferred from the symbols list if not explicitly set. Ghostedness should be indicated through the real field, not zeros here.
mass_numbers (Array, Optional) – An optional ordered 1-D array-like object of atomic mass numbers of shape (nat,). Index matches the 0-indexed indices of all other per-atom settings like symbols and real. Values are inferred from the most common isotopes of the symbols list if not explicitly set. If a single isotope is not (yet) known for an atom, -1 is used as a placeholder.
masses (Array, Optional) – The ordered array of particle masses. Index order matches the 0-indexed indices of all other per-atom fields like symbols and real. If this is not provided, the mass of each atom is inferred from its most common isotope. If this is provided, it must be the same length as symbols but can accept None entries for standard masses to infer from the same index in the symbols field.
masses_units (str, Default: unified_atomic_mass_unit) – Units for atomic masses. Defaults to unified atomic mass unit.
molecular_charge (float, Default: 0.0) – The net electrostatic charge of the molecule. Default unit is elementary charge.
molecular_charge_units (str, Default: elementary_charge) – Units for molecular charge. Defaults to elementary charge.
formal_charges (Array, Optional) – Formal charges of all particles/atoms.
formal_charges_units (str, Default: elementary_charge) – Units for formal charges. Defaults to elementary charge.
partial_charges (Array, Optional) – Assigned partial charges of all particles/atoms.
partial_charges_units (str, Default: elementary_charge) – Units for partial charges. Defaults to elementary charge.
geometry (Array, Optional) – An ordered (natom*ndim,) array for XYZ atomic coordinates. Default unit is Angstrom.
geometry_units (str, Default: angstrom) – Units for atomic geometry. Defaults to Angstroms.
velocities (Array, Optional) – An ordered (natom*ndim,) array for XYZ atomic velocities. Default unit is angstrom/femtosecond.
velocities_units (str, Default: angstrom / femtosecond) – Units for atomic velocities. Defaults to Angstroms/femtoseconds.
connectivity (Array, Optional) – A list of bonds within the molecule. Each entry is a tuple of (atom_index_A, atom_index_B, bond_order) where the atom_index matches the 0-indexed indices of all other per-atom settings like symbols and real. Bonds may be freely reordered and inverted.
substructs (Array, Optional) – A list of (name, num) of connected atoms constituting the building block (e.g. monomer) of the structure (e.g. a polymer). Order follows atomic indices from 0 to natoms-1. E.g. [('ALA', 4), …] means the first atom belongs to the amino acid alanine with residue number 4. The substruct name is at most 4 characters.
provenance (Provenance, Optional) – The provenance information about how this object (and its attributes) were generated, provided, and manipulated.
extras (Dict[Any], Optional) – Additional information to bundle with the molecule. Use for schema development and scratch space.
hash (str, Optional) – The hash code that uniquely identifies this object. Typically not manually assigned but left for MMElemental to handle.
- Return type
None
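The class description above notes that numeric fields are truncated to a fixed number of decimal places to assist with duplicate detection. The sketch below illustrates why that helps, using only the standard library. It is not MMElemental's hashing code: the function name `fingerprint`, the use of SHA-1 over a JSON payload, and the choice of rounding rather than strict truncation are all assumptions made for illustration.

```python
import hashlib
import json
from typing import Sequence


def fingerprint(geometry: Sequence[float], decimals: int = 8) -> str:
    """Hash a coordinate list after rounding to a fixed number of decimals."""
    # Adding 0.0 turns -0.0 into 0.0 so both serialize identically.
    rounded = [round(x, decimals) + 0.0 for x in geometry]
    payload = json.dumps(rounded).encode("utf-8")
    return hashlib.sha1(payload).hexdigest()[:7]


a = [2.0, 2.09, 0.0]
b = [2.0 + 1e-12, 2.09, -1e-13]  # same geometry up to floating-point noise
c = [2.0, 2.09, 0.001]           # genuinely different geometry

print(fingerprint(a) == fingerprint(b))  # True: noise is removed before hashing
print(fingerprint(a) == fingerprint(c))  # False: the difference survives rounding
```

Without the rounding step, the tiny differences between `a` and `b` would produce different hashes for what is effectively the same structure.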
- classmethod from_data(data=None, dtype=None, **kwargs)
Constructs a Molecule object from a data object.
- Parameters
data (Any, optional) – Data to construct Molecule from such as a data object (e.g. MDAnalysis.Universe) or dict.
dtype (str, optional) – How to interpret the data, if not passed attempts to discover this based on input type.
**kwargs (Optional[Dict[str, Any]], optional) – Additional kwargs to pass to the constructors.
- Returns
A Molecule object.
- Return type
Molecule
Examples
>>> mol = mmelemental.models.Molecule.from_data(universe, dtype="mdanalysis")
>>> mol = mmelemental.models.Molecule.from_data(struct, dtype="parmed")
- classmethod from_file(filename, top_filename=None, dtype=None, *, translator=None, **kwargs)
Constructs a Molecule object from a file.
- Parameters
filename (str) – The molecular structure filename to read
top_filename (str, optional) – The topology i.e. connectivity filename to read
dtype (str, optional) – The type of file to interpret. If not set, mmelemental attempts to discover the file type.
translator (Optional[str], optional) – Translator name, e.g. mmic_rdkit. Takes precedence over dtype. If unset, MMElemental attempts to find an appropriate translator if it is registered in the TransComponent class.
**kwargs (Optional[Dict[str, Any]], optional) – Any additional keywords to pass to the constructor.
- Returns
A Molecule object.
- Return type
Molecule
- get_hash()
Returns the hash of the molecule.
- get_molecular_formula(order='alphabetical')
Returns the molecular formula for a molecule.
- Parameters
order (str, optional) – Sorting order of the formula. Valid choices are “alphabetical” and “hill”.
- Returns
The molecular formula.
- Return type
str
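The two sorting orders accepted by get_molecular_formula() can be illustrated with a short standalone sketch. This is not the library's implementation; the function `molecular_formula` below is a hypothetical helper, and it assumes that a count of 1 is omitted from the formula. In the Hill system, carbon comes first and hydrogen second when carbon is present, with the remaining elements in alphabetical order; without carbon, all elements are alphabetical.

```python
from collections import Counter
from typing import Sequence


def molecular_formula(symbols: Sequence[str], order: str = "alphabetical") -> str:
    """Build a formula string from element symbols in the requested order."""
    counts = Counter(symbols)
    if order == "alphabetical":
        elements = sorted(counts)
    elif order == "hill":
        if "C" in counts:
            # Carbon first, hydrogen second, everything else alphabetical.
            first = [el for el in ("C", "H") if el in counts]
            rest = sorted(el for el in counts if el not in ("C", "H"))
            elements = first + rest
        else:
            # No carbon: the Hill system falls back to alphabetical order.
            elements = sorted(counts)
    else:
        raise ValueError(f"unknown order: {order!r}")
    return "".join(
        el + (str(counts[el]) if counts[el] > 1 else "") for el in elements
    )


water = ["O", "H", "H"]
bromomethane = ["C", "H", "H", "H", "Br"]

print(molecular_formula(water))                 # H2O
print(molecular_formula(bromomethane))          # BrCH3
print(molecular_formula(bromomethane, "hill"))  # CH3Br
```

The two orders agree for water but differ for bromomethane, where the Hill system moves carbon and hydrogen ahead of bromine.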
- get_substructs()
Removes duplicate entries in substructs while preserving the order.
- Return type
List
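Order-preserving removal of duplicates, as described for get_substructs(), can be sketched in plain Python. The helper `unique_substructs` is hypothetical and only illustrates the idea; the per-atom data below is made up for the example.

```python
from typing import List, Sequence, Tuple

Substruct = Tuple[str, int]


def unique_substructs(substructs: Sequence[Substruct]) -> List[Substruct]:
    """Drop repeated (name, num) entries, keeping the first occurrence of each."""
    # dict preserves insertion order, so the first-seen order is retained.
    return list(dict.fromkeys(substructs))


# One (name, num) entry per atom: three atoms in ALA 4, two atoms in GLY 5.
per_atom = [("ALA", 4), ("ALA", 4), ("ALA", 4), ("GLY", 5), ("GLY", 5)]

print(unique_substructs(per_atom))  # [('ALA', 4), ('GLY', 5)]
```

Using `dict.fromkeys` rather than `set` matters here, because a set would not guarantee that residues come back in their original order.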
- show(ngl_kwargs=None)
Creates a 3D representation of a molecule that can be manipulated in Jupyter Notebooks and exported as images (.png).
- Parameters
ngl_kwargs (Optional[Dict[str, Any]]) – Additional nglview NGLWidget kwargs.
- Returns
An nglview view of the molecule.
- Return type
nglview.NGLWidget
- to_data(dtype=None, *, translator=None, **kwargs)
Converts Molecule to toolkit-specific molecule (e.g. rdkit, MDAnalysis, parmed).
- Parameters
dtype (str, optional) – The type of data object to convert to e.g. mdanalysis, rdkit, parmed, etc.
translator (Optional[str], optional) – Translator name, e.g. mmic_rdkit. Takes precedence over dtype. If unset, MMElemental attempts to find an appropriate translator if it is registered in the TransComponent class.
**kwargs (Optional[Dict[str, Any]], optional) – Additional kwargs to pass to the constructor.
- Returns
Toolkit-specific molecule model
- Return type
ToolkitModel
- to_file(filename, dtype=None, *, translator=None, **kwargs)
Writes the Molecule to a file.
- Parameters
filename (str) – The filename to write to
dtype (Optional[str], optional) – The type of file to write (e.g. json, pdb, etc.), attempts to infer dtype from file extension if not provided.
translator (Optional[str], optional) – Translator name, e.g. mmic_rdkit. Takes precedence over dtype. If unset, MMElemental attempts to find an appropriate translator if it is registered in the TransComponent class.
**kwargs (Optional[Dict[str, Any]], optional) – Additional kwargs to pass to the constructor.
- Return type
None
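The file methods above describe two rules: an explicit translator takes precedence over dtype, and a missing dtype is inferred from the file extension. The sketch below models only that decision order in plain Python. The function `resolve_io_target` and its return format are hypothetical, and the actual translator lookup through TransComponent is not modeled.

```python
from pathlib import Path
from typing import Optional, Tuple


def resolve_io_target(
    filename: str,
    dtype: Optional[str] = None,
    translator: Optional[str] = None,
) -> Tuple[str, str]:
    """Decide whether a translator name or a dtype drives the file I/O."""
    if translator is not None:
        # An explicit translator wins, even if dtype is also given.
        return ("translator", translator)
    if dtype is None:
        # Fall back to the file extension, e.g. "water.pdb" -> "pdb".
        suffix = Path(filename).suffix
        if not suffix:
            raise ValueError(f"cannot infer dtype from {filename!r}")
        dtype = suffix.lstrip(".").lower()
    return ("dtype", dtype)


print(resolve_io_target("water.pdb"))                           # ('dtype', 'pdb')
print(resolve_io_target("water.out", dtype="json"))             # ('dtype', 'json')
print(resolve_io_target("water.pdb", translator="mmic_rdkit"))  # ('translator', 'mmic_rdkit')
```

Passing dtype explicitly is the safer choice when a filename has no extension or an ambiguous one.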