Molecule
A Python implementation of the MolSSI QCSchema Molecule object.
There are many definitions of Molecule depending on the domain; this particular Molecule is an immutable 3D Cartesian representation with support for quantum chemistry constructs.
Creation
A Molecule can be created by passing keyword arguments to the constructor, as shown below:
>>> mol = qcel.models.Molecule(**{"symbols": ["He"], "geometry": [0, 0, 0]})
In addition, the from_data class method creates a molecule from standard strings:
>>> mol = qcel.models.Molecule.from_data("He 0 0 0")
>>> mol
< Geometry (in Angstrom), charge = 0.0, multiplicity = 1:
Center X Y Z
------------ ----------------- ----------------- -----------------
He 0.000000000000 0.000000000000 0.000000000000
>
Identifiers
A number of unique identifiers are automatically created for each molecule. Additional identifiers such as InChI and SMILES are actively being looked into.
Molecular Hash
A molecule hash is automatically created to allow each molecule to be uniquely identified. The following keys are used to generate the hash:
symbols
masses (1.e-6 tolerance)
molecular_charge (1.e-4 tolerance)
molecular_multiplicity
real
geometry (1.e-8 tolerance)
fragments
fragment_charges (1.e-4 tolerance)
fragment_multiplicities
connectivity
Hashes can be acquired from any molecule object, and a FractalServer automatically generates canonical hashes when a molecule is added to the database.
>>> mol = qcel.models.Molecule(**{"symbols": ["He", "He"], "geometry": [0, 0, -3, 0, 0, 3]})
>>> mol.get_hash()
'84872f975d19aafa62b188b40fbadaf26a3b1f84'
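The role of the tolerances can be illustrated with a minimal sketch. This is not QCElemental's hashing routine: the helper names (`sketch_hash`, `_rounded`), the choice of JSON serialization, and the use of SHA-1 are assumptions made for illustration only. It shows how rounding float fields before hashing lets two numerically near-identical molecules share a hash.

```python
import hashlib
import json

# Hypothetical decimal places per field, mirroring the tolerances listed above.
DECIMALS = {"geometry": 8, "masses": 6, "molecular_charge": 4, "fragment_charges": 4}


def _rounded(value, ndigits):
    """Round a float or a flat list of floats; adding 0.0 turns -0.0 into 0.0."""
    if isinstance(value, (list, tuple)):
        return [round(float(v), ndigits) + 0.0 for v in value]
    return round(float(value), ndigits) + 0.0


def sketch_hash(fields):
    """Hash a dict of molecule-like fields after rounding the float-valued ones."""
    canonical = {}
    for key, value in fields.items():
        canonical[key] = _rounded(value, DECIMALS[key]) if key in DECIMALS else value
    payload = json.dumps(canonical, sort_keys=True)
    return hashlib.sha1(payload.encode("utf-8")).hexdigest()


base = {"symbols": ["He", "He"], "geometry": [0, 0, -3, 0, 0, 3], "molecular_charge": 0}
noisy = {"symbols": ["He", "He"], "geometry": [0, 0, -3, 0, 0, 3 + 1e-10], "molecular_charge": 0.0}
moved = {"symbols": ["He", "He"], "geometry": [0, 0, -3, 0, 0, 3.001], "molecular_charge": 0}

same = sketch_hash(base) == sketch_hash(noisy)       # noise below 1e-8 is rounded away
different = sketch_hash(base) != sketch_hash(moved)  # a 1e-3 shift survives rounding
```

The hash strings produced by this sketch will not match those returned by get_hash(); only the rounding idea carries over.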
Molecular Formula
The molecular formula is also available, with title-case element symbols sorted in alphabetical order. A symbol with a count of one has no number attached to it.
>>> mol.get_molecular_formula()
'He2'
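The formatting rule can be sketched in a few lines of plain Python. The function name `alphabetical_formula` is hypothetical and this is only an illustration of the stated rule (title case, alphabetical order, no number for a count of one), not the library's code.

```python
from collections import Counter


def alphabetical_formula(symbols):
    """Build a formula string with alphabetically sorted, title-case symbols."""
    counts = Counter(symbol.title() for symbol in symbols)
    parts = []
    for element, count in sorted(counts.items()):
        # A count of one is written without a number.
        parts.append(element if count == 1 else f"{element}{count}")
    return "".join(parts)


helium_dimer = alphabetical_formula(["He", "He"])
hydrogen_chloride = alphabetical_formula(["H", "Cl"])
```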
Fragments
A Molecule with fragments can be created either by using the -- separators in the from_data function or by passing explicit fragments to the Molecule constructor:
>>> mol = qcel.models.Molecule.from_data(
... """
... Ne 0.000000 0.000000 0.000000
... --
... Ne 3.100000 0.000000 0.000000
... units au
... """)
>>> mol = qcel.models.Molecule(
...     geometry=[0, 0, 0, 3.1, 0, 0],
...     symbols=["Ne", "Ne"],
...     fragments=[[0], [1]]
... )
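To make the correspondence between the two forms concrete, here is a toy parser for the small subset of the string format used above. It is a sketch under stated assumptions: `parse_fragment_string` is a hypothetical helper, it only understands `symbol x y z` lines, the `--` separator, and a `units` line, and it is not the parser behind from_data.

```python
def parse_fragment_string(text):
    """Split a 'symbol x y z' block into symbols, flat geometry, and fragments."""
    symbols, geometry, fragments = [], [], [[]]
    units = None
    for raw in text.strip().splitlines():
        line = raw.strip()
        if not line:
            continue
        if line == "--":
            fragments.append([])  # a separator starts a new fragment
        elif line.lower().startswith("units"):
            units = line.split()[1]
        else:
            symbol, x, y, z = line.split()
            fragments[-1].append(len(symbols))  # 0-indexed atom position
            symbols.append(symbol)
            geometry.extend(float(v) for v in (x, y, z))
    return {"symbols": symbols, "geometry": geometry, "fragments": fragments, "units": units}


parsed = parse_fragment_string(
    """
    Ne 0.000000 0.000000 0.000000
    --
    Ne 3.100000 0.000000 0.000000
    units au
    """
)
```

The resulting `symbols`, `geometry`, and `fragments` entries line up with the keyword arguments passed to the constructor in the second form.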
Fragments from a molecule containing fragment information can be acquired by:
>>> mol.get_fragment(0)
< Geometry (in Angstrom), charge = 0.0, multiplicity = 1:
Center X Y Z
------------ ----------------- ----------------- -----------------
Ne 0.000000000000 0.000000000000 0.000000000000
>
Obtaining fragments with ghost atoms is also supported:
>>> mol.get_fragment(0, 1)
< Geometry (in Angstrom), charge = 0.0, multiplicity = 1:
Center X Y Z
------------ ----------------- ----------------- -----------------
Ne 0.000000000000 0.000000000000 0.000000000000
Ne (Gh) 3.100000000572 0.000000000000 0.000000000000
>
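The bookkeeping behind real and ghost fragments can be sketched without the library. The helper `select_fragments` below is hypothetical; it only shows how fragment index lists could map onto a per-atom symbol list and a real/ghost mask, with real fragments placed first.

```python
def select_fragments(symbols, fragments, real, ghost=()):
    """Return (symbols, real_mask) with real fragments first, then ghost fragments."""
    out_symbols, out_real = [], []
    for fragment_indices, is_real in ((real, True), (ghost, False)):
        for ifr in fragment_indices:
            for iat in fragments[ifr]:
                out_symbols.append(symbols[iat])
                out_real.append(is_real)
    return out_symbols, out_real


# Fragment 0 real, fragment 1 ghosted, as in get_fragment(0, 1) above.
dimer_symbols, dimer_real = select_fragments(["Ne", "Ne"], [[0], [1]], real=[0], ghost=[1])
# Fragment 0 alone, as in get_fragment(0) above.
monomer_symbols, monomer_real = select_fragments(["Ne", "Ne"], [[0], [1]], real=[0])
```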
API
- pydantic model qcelemental.models.Molecule
The physical Cartesian representation of the molecular system.
A QCSchema representation of a Molecule. This model contains data for symbols, geometry, connectivity, charges, fragmentation, etc., while also supporting a wide array of I/O and manipulation capabilities.
A Molecule object's geometry, masses, and charges are truncated to 8, 6, and 4 decimal places, respectively, to assist with duplicate detection.
Notes
All arrays are stored flat but must be reshapable into the dimensions in attribute shape, with abbreviations as follows:
nat: number of atoms = calcinfo_natom
nfr: number of fragments
<varies>: irregular dimension not systematically reshapable
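As an illustration of the flat-versus-shaped convention, the following sketch reshapes a flat geometry list of length 3*nat into nat rows of three coordinates. `reshape_geometry` is a hypothetical helper written in plain Python; it does not reproduce the model's internal validator.

```python
def reshape_geometry(flat, nat):
    """Reshape a flat coordinate list of length 3*nat into nat [x, y, z] rows."""
    if len(flat) != 3 * nat:
        raise ValueError(f"expected {3 * nat} values for {nat} atoms, got {len(flat)}")
    return [list(flat[3 * i : 3 * i + 3]) for i in range(nat)]


rows = reshape_geometry([0, 0, -3, 0, 0, 3], nat=2)
```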
Show JSON schema
{ "title": "Molecule", "description": "The physical Cartesian representation of the molecular system.\n\nA QCSchema representation of a Molecule. This model contains\ndata for symbols, geometry, connectivity, charges, fragmentation, etc while also supporting a wide array of I/O and manipulation capabilities.\n\nMolecule objects geometry, masses, and charges are truncated to 8, 6, and 4 decimal places respectively to assist with duplicate detection.\n\nNotes\n-----\nAll arrays are stored flat but must be reshapable into the dimensions in attribute ``shape``, with abbreviations as follows:\n\n * nat: number of atomic = calcinfo_natom\n * nfr: number of fragments\n * <varies>: irregular dimension not systematically reshapable", "type": "object", "properties": { "schema_name": { "title": "Schema Name", "description": "The QCSchema specification to which this model conforms. Explicitly fixed as qcschema_molecule.", "default": "qcschema_molecule", "pattern": "^(qcschema_molecule)$", "type": "string" }, "schema_version": { "title": "Schema Version", "description": "The version number of :attr:`~qcelemental.models.Molecule.schema_name` to which this model conforms.", "default": 2, "type": "integer" }, "validated": { "title": "Validated", "description": "A boolean indicator (for speed purposes) that the input Molecule data has been previously checked for schema (data layout and type) and physics (e.g., non-overlapping atoms, feasible multiplicity) compliance. This should be False in most cases. A ``True`` setting should only ever be set by the constructor for this class itself or other trusted sources such as a Fractal Server or previously serialized Molecules.", "default": false, "type": "boolean" }, "symbols": { "title": "Symbols", "description": "The ordered array of atomic elemental symbols in title case. 
This field's index sets atomic order for all other per-atom fields like :attr:`~qcelemental.models.Molecule.real` and the first dimension of :attr:`~qcelemental.models.Molecule.geometry`. Ghost/virtual atoms must have an entry here in :attr:`~qcelemental.models.Molecule.symbols`; ghostedness is indicated through the :attr:`~qcelemental.models.Molecule.real` field.", "shape": [ "nat" ], "type": "array", "items": { "type": "string" } }, "geometry": { "title": "Geometry", "description": "The ordered array for Cartesian XYZ atomic coordinates [a0]. Atom ordering is fixed; that is, a consumer who shuffles atoms must not reattach the input (pre-shuffling) molecule schema instance to any output (post-shuffling) per-atom results (e.g., gradient). Index of the first dimension matches the 0-indexed indices of all other per-atom settings like :attr:`~qcelemental.models.Molecule.symbols` and :attr:`~qcelemental.models.Molecule.real`.\nSerialized storage is always flat, (3*nat,), but QCSchema implementations will want to reshape it. QCElemental can also accept array-likes which can be mapped to (nat,3) such as a 1-D list of length 3*nat, or the serialized version of the array in (3*nat,) shape; all forms will be reshaped to (nat,3) for this attribute.", "shape": [ "nat", 3 ], "units": "a0", "type": "array", "items": { "type": "number" } }, "name": { "title": "Name", "description": "Common or human-readable name to assign to this molecule. This field can be arbitrary; see :attr:`~qcelemental.models.Molecule.identifiers` for well-defined labels.", "type": "string" }, "identifiers": { "title": "Identifiers", "description": "An optional dictionary of additional identifiers by which this molecule can be referenced, such as INCHI, canonical SMILES, etc. See the :class:`~qcelemental.models.results.Identifiers` model for more details.", "allOf": [ { "$ref": "#/definitions/Identifiers" } ] }, "comment": { "title": "Comment", "description": "Additional comments for this molecule. 
Intended for pure human/user consumption and clarity.", "type": "string" }, "molecular_charge": { "title": "Molecular Charge", "description": "The net electrostatic charge of the molecule.", "default": 0.0, "type": "number" }, "molecular_multiplicity": { "title": "Molecular Multiplicity", "description": "The total multiplicity of the molecule.", "default": 1, "type": "number" }, "masses": { "title": "Masses", "description": "The ordered array of atomic masses. Index order matches the 0-indexed indices of all other per-atom fields like :attr:`~qcelemental.models.Molecule.symbols` and :attr:`~qcelemental.models.Molecule.real`. If this is not provided, the mass of each atom is inferred from its most common isotope. If this is provided, it must be the same length as :attr:`~qcelemental.models.Molecule.symbols` but can accept ``None`` entries for standard masses to infer from the same index in the :attr:`~qcelemental.models.Molecule.symbols` field.", "shape": [ "nat" ], "units": "u", "type": "array", "items": { "type": "number" } }, "real": { "title": "Real", "description": "The ordered array indicating if each atom is real (``True``) or ghost/virtual (``False``). Index matches the 0-indexed indices of all other per-atom settings like :attr:`~qcelemental.models.Molecule.symbols` and the first dimension of :attr:`~qcelemental.models.Molecule.geometry`. If this is not provided, all atoms are assumed to be real (``True``).If this is provided, the reality or ghostedness of every atom must be specified.", "shape": [ "nat" ], "type": "array", "items": { "type": "boolean" } }, "atom_labels": { "title": "Atom Labels", "description": "Additional per-atom labels as an array of strings. Typical use is in model conversions, such as Elemental <-> Molpro and not typically something which should be user assigned. 
See the :attr:`~qcelemental.models.Molecule.comment` field for general human-consumable text to affix to the molecule.", "shape": [ "nat" ], "type": "array", "items": { "type": "string" } }, "atomic_numbers": { "title": "Atomic Numbers", "description": "An optional ordered 1-D array-like object of atomic numbers of shape (nat,). Index matches the 0-indexed indices of all other per-atom settings like :attr:`~qcelemental.models.Molecule.symbols` and :attr:`~qcelemental.models.Molecule.real`. Values are inferred from the :attr:`~qcelemental.models.Molecule.symbols` list if not explicitly set. Ghostedness should be indicated through :attr:`~qcelemental.models.Molecule.real` field, not zeros here.", "shape": [ "nat" ], "type": "array", "items": { "type": "number", "multipleOf": 1.0 } }, "mass_numbers": { "title": "Mass Numbers", "description": "An optional ordered 1-D array-like object of atomic *mass* numbers of shape (nat). Index matches the 0-indexed indices of all other per-atom settings like :attr:`~qcelemental.models.Molecule.symbols` and :attr:`~qcelemental.models.Molecule.real`. Values are inferred from the most common isotopes of the :attr:`~qcelemental.models.Molecule.symbols` list if not explicitly set. If single isotope not (yet) known for an atom, -1 is placeholder.", "shape": [ "nat" ], "type": "array", "items": { "type": "number", "multipleOf": 1.0 } }, "connectivity": { "title": "Connectivity", "description": "A list of bonds within the molecule. Each entry is a tuple of ``(atom_index_A, atom_index_B, bond_order)`` where the ``atom_index`` matches the 0-indexed indices of all other per-atom settings like :attr:`~qcelemental.models.Molecule.symbols` and :attr:`~qcelemental.models.Molecule.real`. 
Bonds may be freely reordered and inverted.", "minItems": 1, "type": "array", "items": { "type": "array", "minItems": 3, "maxItems": 3, "items": [ { "type": "integer", "minimum": 0 }, { "type": "integer", "minimum": 0 }, { "type": "number", "minimum": 0, "maximum": 5 } ] } }, "fragments": { "title": "Fragments", "description": "List of indices grouping atoms (0-indexed) into molecular fragments within the molecule. Each entry in the outer list is a new fragment; index matches the ordering in :attr:`~qcelemental.models.Molecule.fragment_charges` and :attr:`~qcelemental.models.Molecule.fragment_multiplicities`. Inner lists are 0-indexed atoms which compose the fragment; every atom must be in exactly one inner list. Noncontiguous fragments are allowed, though no QM program is known to support them. Fragment ordering is fixed; that is, a consumer who shuffles fragments must not reattach the input (pre-shuffling) molecule schema instance to any output (post-shuffling) per-fragment results (e.g., n-body energy arrays).", "shape": [ "nfr", "<varies>" ], "type": "array", "items": { "type": "array", "items": { "type": "number", "multipleOf": 1.0 } } }, "fragment_charges": { "title": "Fragment Charges", "description": "The total charge of each fragment in the :attr:`~qcelemental.models.Molecule.fragments` list. The index of this list matches the 0-index indices of :attr:`~qcelemental.models.Molecule.fragments` list. Will be filled in based on a set of rules if not provided (and :attr:`~qcelemental.models.Molecule.fragments` are specified).", "shape": [ "nfr" ], "type": "array", "items": { "type": "number" } }, "fragment_multiplicities": { "title": "Fragment Multiplicities", "description": "The multiplicity of each fragment in the :attr:`~qcelemental.models.Molecule.fragments` list. The index of this list matches the 0-index indices of :attr:`~qcelemental.models.Molecule.fragments` list. 
Will be filled in based on a set of rules if not provided (and :attr:`~qcelemental.models.Molecule.fragments` are specified).", "shape": [ "nfr" ], "type": "array", "items": { "type": "number" } }, "fix_com": { "title": "Fix Com", "description": "Whether translation of geometry is allowed (fix F) or disallowed (fix T).When False, QCElemental will pre-process the Molecule object to translate the center of mass to (0,0,0) in Euclidean coordinate space, resulting in a different :attr:`~qcelemental.models.Molecule.geometry` than the one provided. 'Fix' is used in the sense of 'specify': that is, `fix_com=True` signals that the origin in `geometry` is a deliberate part of the Molecule spec, whereas `fix_com=False` (default) allows that the origin is happenstance and may be adjusted. guidance: A consumer who translates the geometry must not reattach the input (pre-translation) molecule schema instance to any output (post-translation) origin-sensitive results (e.g., an ordinary energy when EFP present).", "default": false, "type": "boolean" }, "fix_orientation": { "title": "Fix Orientation", "description": "Whether rotation of geometry is allowed (fix F) or disallowed (fix T). When False, QCElemental will pre-process the Molecule object to orient via the intertial tensor, resulting in a different :attr:`~qcelemental.models.Molecule.geometry` than the one provided. 'Fix' is used in the sense of 'specify': that is, `fix_orientation=True` signals that the frame orientation in `geometry` is a deliberate part of the Molecule spec, whereas `fix_orientation=False` (default) allows that the frame is happenstance and may be adjusted. 
guidance: A consumer who rotates the geometry must not reattach the input (pre-rotation) molecule schema instance to any output (post-rotation) frame-sensitive results (e.g., molecular vibrations).", "default": false, "type": "boolean" }, "fix_symmetry": { "title": "Fix Symmetry", "description": "Maximal point group symmetry which :attr:`~qcelemental.models.Molecule.geometry` should be treated. Lowercase.", "type": "string" }, "provenance": { "title": "Provenance", "description": "The provenance information about how this Molecule (and its attributes) were generated, provided, and manipulated.", "allOf": [ { "$ref": "#/definitions/Provenance" } ] }, "id": { "title": "Id", "description": "A unique identifier for this Molecule object. This field exists primarily for Databases (e.g. Fractal's Server) to track and lookup this specific object and should virtually never need to be manually set." }, "extras": { "title": "Extras", "description": "Additional information to bundle with the molecule. 
Use for schema development and scratch space.", "type": "object" } }, "required": [ "symbols", "geometry" ], "additionalProperties": false, "$schema": "http://json-schema.org/draft-04/schema#", "definitions": { "Identifiers": { "title": "Identifiers", "description": "Canonical chemical identifiers", "type": "object", "properties": { "molecule_hash": { "title": "Molecule Hash", "type": "string" }, "molecular_formula": { "title": "Molecular Formula", "type": "string" }, "smiles": { "title": "Smiles", "type": "string" }, "inchi": { "title": "Inchi", "type": "string" }, "inchikey": { "title": "Inchikey", "type": "string" }, "canonical_explicit_hydrogen_smiles": { "title": "Canonical Explicit Hydrogen Smiles", "type": "string" }, "canonical_isomeric_explicit_hydrogen_mapped_smiles": { "title": "Canonical Isomeric Explicit Hydrogen Mapped Smiles", "type": "string" }, "canonical_isomeric_explicit_hydrogen_smiles": { "title": "Canonical Isomeric Explicit Hydrogen Smiles", "type": "string" }, "canonical_isomeric_smiles": { "title": "Canonical Isomeric Smiles", "type": "string" }, "canonical_smiles": { "title": "Canonical Smiles", "type": "string" }, "pubchem_cid": { "title": "Pubchem Cid", "description": "PubChem Compound ID", "type": "string" }, "pubchem_sid": { "title": "Pubchem Sid", "description": "PubChem Substance ID", "type": "string" }, "pubchem_conformerid": { "title": "Pubchem Conformerid", "description": "PubChem Conformer ID", "type": "string" } }, "additionalProperties": false }, "Provenance": { "title": "Provenance", "description": "Provenance information.", "type": "object", "properties": { "creator": { "title": "Creator", "description": "The name of the program, library, or person who created the object.", "type": "string" }, "version": { "title": "Version", "description": "The version of the creator, blank otherwise. 
This should be sortable by the very broad `PEP 440 <https://www.python.org/dev/peps/pep-0440/>`_.", "default": "", "type": "string" }, "routine": { "title": "Routine", "description": "The name of the routine or function within the creator, blank otherwise.", "default": "", "type": "string" } }, "required": [ "creator" ], "$schema": "http://json-schema.org/draft-04/schema#" } } }
- Fields:
- Validators:
_int_if_possible » molecular_multiplicity
_must_be_3n » geometry
_must_be_n » masses
_must_be_n » real
_must_be_n_frag » fragment_charges
_must_be_n_frag_mult » fragment_multiplicities
_populate_real » real
-
field atom_labels: Optional[Array] = None (name 'atom_labels_')
Additional per-atom labels as an array of strings. Typical use is in model conversions, such as Elemental <-> Molpro, and not typically something which should be user assigned. See the comment field for general human-consumable text to affix to the molecule.
- Constraints:
type = array
items = {'type': 'string'}
-
field atomic_numbers: Optional[Array] = None (name 'atomic_numbers_')
An optional ordered 1-D array-like object of atomic numbers of shape (nat,). Index matches the 0-indexed indices of all other per-atom settings like symbols and real. Values are inferred from the symbols list if not explicitly set. Ghostedness should be indicated through the real field, not zeros here.
- Constraints:
type = array
items = {'type': 'number', 'multipleOf': 1.0}
-
field comment: Optional[str] = None
Additional comments for this molecule. Intended for pure human/user consumption and clarity.
-
field connectivity: Optional[List[Tuple[NonnegativeInt, NonnegativeInt, BondOrderFloat]]] = None (name 'connectivity_')
A list of bonds within the molecule. Each entry is a tuple of (atom_index_A, atom_index_B, bond_order) where the atom_index matches the 0-indexed indices of all other per-atom settings like symbols and real. Bonds may be freely reordered and inverted.
- Constraints:
minItems = 1
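The constraints on connectivity can be restated as a small check. This is a sketch, not the model's validation: `check_connectivity` is a hypothetical helper, the 0 to 5 bond-order range is taken from the JSON schema above, and the requirement that indices be smaller than the atom count is an added assumption.

```python
def check_connectivity(connectivity, nat):
    """Raise ValueError if any bond tuple violates the constraints; else return True."""
    if len(connectivity) < 1:
        raise ValueError("connectivity must contain at least one bond (minItems = 1)")
    for atom_a, atom_b, bond_order in connectivity:
        for index in (atom_a, atom_b):
            # Assumption: an index must point at an existing atom.
            if not isinstance(index, int) or not 0 <= index < nat:
                raise ValueError(f"atom index {index!r} is outside 0..{nat - 1}")
        if not 0 <= bond_order <= 5:
            raise ValueError(f"bond order {bond_order!r} is outside 0..5")
    return True


water_bonds_ok = check_connectivity([(0, 1, 1.0), (0, 2, 1.0)], nat=3)
```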
-
field extras: Dict[str, Any] = None
Additional information to bundle with the molecule. Use for schema development and scratch space.
-
field fix_com: bool = False
Whether translation of geometry is allowed (fix F) or disallowed (fix T). When False, QCElemental will pre-process the Molecule object to translate the center of mass to (0,0,0) in Euclidean coordinate space, resulting in a different geometry than the one provided. 'Fix' is used in the sense of 'specify': that is, fix_com=True signals that the origin in geometry is a deliberate part of the Molecule spec, whereas fix_com=False (default) allows that the origin is happenstance and may be adjusted. Guidance: A consumer who translates the geometry must not reattach the input (pre-translation) molecule schema instance to any output (post-translation) origin-sensitive results (e.g., an ordinary energy when EFP present).
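The center-of-mass translation described for fix_com=False amounts to subtracting the mass-weighted mean position from every atom. The sketch below uses a hypothetical helper, `translate_to_com`, on plain Python lists; it illustrates the arithmetic only and is not the library's pre-processing code.

```python
def translate_to_com(geometry, masses):
    """Shift (nat, 3) coordinates so that the center of mass sits at the origin."""
    total_mass = sum(masses)
    com = [
        sum(mass * atom[k] for mass, atom in zip(masses, geometry)) / total_mass
        for k in range(3)
    ]
    return [[atom[k] - com[k] for k in range(3)] for atom in geometry]


# Two equal masses at z = 0 and z = 6 end up at z = -3 and z = +3.
centered = translate_to_com([[0, 0, 0], [0, 0, 6]], masses=[4.0, 4.0])
```

Because the stored geometry changes, any origin-sensitive result must be paired with the translated coordinates rather than the input ones, which is the point of the guidance above.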
-
field fix_orientation: bool = False
Whether rotation of geometry is allowed (fix F) or disallowed (fix T). When False, QCElemental will pre-process the Molecule object to orient via the inertial tensor, resulting in a different geometry than the one provided. 'Fix' is used in the sense of 'specify': that is, fix_orientation=True signals that the frame orientation in geometry is a deliberate part of the Molecule spec, whereas fix_orientation=False (default) allows that the frame is happenstance and may be adjusted. Guidance: A consumer who rotates the geometry must not reattach the input (pre-rotation) molecule schema instance to any output (post-rotation) frame-sensitive results (e.g., molecular vibrations).
-
field fix_symmetry: Optional[str] = None
Maximal point group symmetry under which geometry should be treated. Lowercase.
-
field fragment_charges: Optional[List[float]] = None (name 'fragment_charges_')
The total charge of each fragment in the fragments list. The index of this list matches the 0-indexed indices of the fragments list. Will be filled in based on a set of rules if not provided (and fragments are specified).
- Validated by:
_must_be_n_frag
-
field fragment_multiplicities: Optional[List[float]] = None (name 'fragment_multiplicities_')
The multiplicity of each fragment in the fragments list. The index of this list matches the 0-indexed indices of the fragments list. Will be filled in based on a set of rules if not provided (and fragments are specified).
- Validated by:
_must_be_n_frag_mult
-
field fragments: Optional[List[Array]] = None (name 'fragments_')
List of indices grouping atoms (0-indexed) into molecular fragments within the molecule. Each entry in the outer list is a new fragment; index matches the ordering in fragment_charges and fragment_multiplicities. Inner lists are 0-indexed atoms which compose the fragment; every atom must be in exactly one inner list. Noncontiguous fragments are allowed, though no QM program is known to support them. Fragment ordering is fixed; that is, a consumer who shuffles fragments must not reattach the input (pre-shuffling) molecule schema instance to any output (post-shuffling) per-fragment results (e.g., n-body energy arrays).
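The rule that every atom belongs to exactly one inner list can be checked with a short function. `fragments_cover_atoms` is a hypothetical name, and this sketch only restates the rule from the description; it is not a validator from the library.

```python
def fragments_cover_atoms(fragments, nat):
    """Return True if each atom index 0..nat-1 appears in exactly one fragment."""
    seen = sorted(index for fragment in fragments for index in fragment)
    return seen == list(range(nat))


contiguous = fragments_cover_atoms([[0], [1]], nat=2)
noncontiguous = fragments_cover_atoms([[0, 2], [1]], nat=3)  # allowed by the schema
duplicated = fragments_cover_atoms([[0], [0, 1]], nat=2)     # atom 0 listed twice
missing = fragments_cover_atoms([[0]], nat=2)                # atom 1 in no fragment
```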
-
field geometry: Array [Required]
The ordered array for Cartesian XYZ atomic coordinates [a0]. Atom ordering is fixed; that is, a consumer who shuffles atoms must not reattach the input (pre-shuffling) molecule schema instance to any output (post-shuffling) per-atom results (e.g., gradient). Index of the first dimension matches the 0-indexed indices of all other per-atom settings like symbols and real. Serialized storage is always flat, (3*nat,), but QCSchema implementations will want to reshape it. QCElemental can also accept array-likes which can be mapped to (nat,3) such as a 1-D list of length 3*nat, or the serialized version of the array in (3*nat,) shape; all forms will be reshaped to (nat,3) for this attribute.
- Constraints:
type = array
items = {'type': 'number'}
- Validated by:
_must_be_3n
-
field id: Optional[Any] = None
A unique identifier for this Molecule object. This field exists primarily for databases (e.g. Fractal's Server) to track and look up this specific object and should virtually never need to be manually set.
-
field identifiers: Optional[Identifiers] = None
An optional dictionary of additional identifiers by which this molecule can be referenced, such as INCHI, canonical SMILES, etc. See the Identifiers model for more details.
-
field mass_numbers: Optional[Array] = None (name 'mass_numbers_')
An optional ordered 1-D array-like object of atomic mass numbers of shape (nat). Index matches the 0-indexed indices of all other per-atom settings like symbols and real. Values are inferred from the most common isotopes of the symbols list if not explicitly set. If single isotope not (yet) known for an atom, -1 is placeholder.
- Constraints:
type = array
items = {'type': 'number', 'multipleOf': 1.0}
-
field masses: Optional[Array] = None (name 'masses_')
The ordered array of atomic masses. Index order matches the 0-indexed indices of all other per-atom fields like symbols and real. If this is not provided, the mass of each atom is inferred from its most common isotope. If this is provided, it must be the same length as symbols but can accept None entries for standard masses to infer from the same index in the symbols field.
- Constraints:
type = array
items = {'type': 'number'}
- Validated by:
_must_be_n
-
field molecular_charge: float = 0.0
The net electrostatic charge of the molecule.
-
field molecular_multiplicity: float = 1
The total multiplicity of the molecule.
- Validated by:
_int_if_possible
-
field name: Optional[str] = None
Common or human-readable name to assign to this molecule. This field can be arbitrary; see identifiers for well-defined labels.
-
field provenance: Provenance [Optional]
The provenance information about how this Molecule (and its attributes) were generated, provided, and manipulated.
-
field real: Optional[Array] = None (name 'real_')
The ordered array indicating if each atom is real (True) or ghost/virtual (False). Index matches the 0-indexed indices of all other per-atom settings like symbols and the first dimension of geometry. If this is not provided, all atoms are assumed to be real (True). If this is provided, the reality or ghostedness of every atom must be specified.
- Constraints:
type = array
items = {'type': 'boolean'}
- Validated by:
_must_be_n
_populate_real
-
field schema_name: ConstrainedStrValue = 'qcschema_molecule'
The QCSchema specification to which this model conforms. Explicitly fixed as qcschema_molecule.
- Constraints:
pattern = ^(qcschema_molecule)$
-
field schema_version: int = 2
The version number of schema_name to which this model conforms.
-
field symbols: Array [Required]
The ordered array of atomic elemental symbols in title case. This field's index sets atomic order for all other per-atom fields like real and the first dimension of geometry. Ghost/virtual atoms must have an entry here in symbols; ghostedness is indicated through the real field.
- Constraints:
type = array
items = {'type': 'string'}
-
field validated: bool = False
A boolean indicator (for speed purposes) that the input Molecule data has been previously checked for schema (data layout and type) and physics (e.g., non-overlapping atoms, feasible multiplicity) compliance. This should be False in most cases. A True setting should only ever be set by the constructor for this class itself or other trusted sources such as a Fractal Server or previously serialized Molecules.
- align(ref_mol, *, do_plot=False, verbose=0, atoms_map=False, run_resorting=False, mols_align=False, run_to_completion=False, uno_cutoff=0.001, run_mirror=False, generic_ghosts=False)
Finds shift, rotation, and atom reordering of concern_mol (self) that best aligns with ref_mol.
Wraps qcelemental.molutil.B787() for qcelemental.models.Molecule. Employs the Kabsch, Hungarian, and Uno algorithms to exhaustively locate the best alignment for non-oriented, non-ordered structures.
- Parameters:
ref_mol (qcelemental.models.Molecule) – Molecule to match.
atoms_map (bool) – Whether atom1 of ref_mol corresponds to atom1 of concern_mol, etc. If true, specifying True can save much time.
mols_align (Union[bool, float]) – Whether ref_mol and concern_mol have identical geometries (barring orientation or atom mapping) and expected final RMSD = 0. If True, procedure is truncated when RMSD condition met, saving time. If float, RMSD tolerance at which search for alignment stops. If provided, the alignment routine will throw an error if it fails to align the molecule within the specified RMSD tolerance.
do_plot (bool) – Pops up a mpl plot showing before, after, and ref geometries.
run_to_completion (bool) – Run reorderings to completion (past RMSD = 0) even if unnecessary because mols_align=True. Used to test worst-case timings.
run_resorting (bool) – Run the resorting machinery even if unnecessary because atoms_map=True.
uno_cutoff (float) – TODO
run_mirror (bool) – Run alternate geometries potentially allowing best match to ref_mol from mirror image of concern_mol. Only run if system confirmed to be nonsuperimposable upon mirror reflection.
generic_ghosts (bool) – When one or both molecules don't have meaningful element info for ghosts (can happen when harvesting from a printout with a generic ghost symbol), set this to True to place all real=False atoms into the same space for alignment. Only allowed when atoms_map=True.
verbose (int) – Print level.
- Return type:
- Returns:
mol (Molecule)
data (Dict[key, Any]) – Molecule is internal geometry of self optimally aligned and atom-ordered to ref_mol. Presently all fragment information is discarded. data['rmsd'] is RMSD [A] between ref_mol and the optimally aligned geometry computed. data['mill'] is an AlignmentMill with fields (shift, rotation, atommap, mirror) that prescribe the transformation from concern_mol to the optimally aligned geometry.
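The RMSD quantity reported in the returned data can be illustrated for the simplest case, where the two geometries are already superimposed and atom-mapped. The helper `rmsd` below is a hypothetical sketch of that final measure only; it performs no shifting, rotation, or reordering and is not the B787 routine.

```python
import math


def rmsd(ref_coords, other_coords):
    """Root-mean-square deviation between two equally ordered (nat, 3) geometries."""
    if len(ref_coords) != len(other_coords):
        raise ValueError("geometries must contain the same number of atoms")
    squared = sum(
        (r[k] - o[k]) ** 2
        for r, o in zip(ref_coords, other_coords)
        for k in range(3)
    )
    return math.sqrt(squared / len(ref_coords))


identical = rmsd([[0, 0, 0], [0, 0, 1]], [[0, 0, 0], [0, 0, 1]])
stretched = rmsd([[0, 0, 0], [0, 0, 1]], [[0, 0, 0], [0, 0, 3]])
```

The result is expressed in whatever length unit the input coordinates use.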
- compare(other)
Compares the current object to the provided object recursively.
- Parameters:
other – The model to compare to.
**kwargs – Additional kwargs to pass to compare_recursive().
- Returns:
True if the objects match.
- Return type:
- dict(*args, **kwargs)
Generate a dictionary representation of the model, optionally specifying which fields to include or exclude.
- element_composition(ifr=None, real_only=True)
Atomic count map.
- Parameters:
- Returns:
composition – Atomic count map.
- Return type:
Notes
This excludes ghost atoms by default whereas get_molecular_formula always includes them.
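The difference between counting with and without ghost atoms can be sketched as follows. `element_counts` is a hypothetical helper operating on plain lists; it mirrors the real_only idea described in the note but is not the library method.

```python
from collections import Counter


def element_counts(symbols, real=None, real_only=True):
    """Count atoms per element, skipping ghost atoms when real_only is True."""
    if real is None:
        real = [True] * len(symbols)  # atoms default to real
    kept = (symbol for symbol, is_real in zip(symbols, real) if is_real or not real_only)
    return dict(Counter(kept))


# One real and one ghost neon atom.
without_ghosts = element_counts(["Ne", "Ne"], real=[True, False])
with_ghosts = element_counts(["Ne", "Ne"], real=[True, False], real_only=False)
```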
- classmethod from_data(data, dtype=None, *, orient=False, validate=None, **kwargs)
Constructs a molecule object from a data structure.
- Parameters:
data (Union[str, Dict[str, Any], ndarray, bytes]) – Data to construct Molecule from.
dtype (Optional[str]) – How to interpret the data; if not passed, attempts to discover this based on input type.
orient (bool) – Orientates the molecule to a standard frame or not.
**kwargs (Dict[str, Any]) – Additional kwargs to pass to the constructors. kwargs take precedence over data.
- Returns:
A constructed molecule class.
- Return type:
- classmethod from_file(filename, dtype=None, *, orient=False, **kwargs)
Constructs a molecule object from a file.
- get_fragment(real, ghost=None, orient=False, group_fragments=True)
Get new Molecule with fragments preserved, dropped, or ghosted.
- Parameters:
real (Union[int, List]) – Fragment index or list of indices (0-indexed) to be real atoms in new Molecule.
ghost (Union[int, List, None]) – Fragment index or list of indices (0-indexed) to be ghost atoms (basis fns only) in new Molecule.
orient (bool) – Whether or not to align (inertial frame) and phase geometry upon new Molecule instantiation (according to _orient_molecule_internal).
group_fragments (bool) – Whether or not to group real fragments at the start of the atom list and ghost fragments toward the back. Previous to v0.5, this was always effectively True. True is handy for finding duplicate (atom-order-independent) molecules by hash. False preserves fragment order (though collapsing gaps for absent fragments) like Psi4's extract_subsets. False is handy for gradients where atom order of returned values matters.
- Returns:
New qcelemental.models.Molecule with self's fragments present, ghosted, or absent.
- Return type:
- get_hash()
Returns the hash of the molecule.
- get_molecular_formula(order='alphabetical', chgmult=False)
Returns the molecular formula for a molecule.
- Parameters:
- Returns:
The molecular formula.
- Return type:
Examples
>>> methane = qcelemental.models.Molecule.from_data('''
... H 0.5288 0.1610 0.9359
... C 0.0000 0.0000 0.0000
... H 0.2051 0.8240 -0.6786
... H 0.3345 -0.9314 -0.4496
... H -1.0685 -0.0537 0.1921
... ''')
>>> methane.get_molecular_formula()
'CH4'
>>> hcl = qcelemental.models.Molecule.from_data('''
... H 0.0000 0.0000 0.0000
... Cl 0.0000 0.0000 1.2000
... ''')
>>> hcl.get_molecular_formula()
'ClH'
>>> two_pentanol_radcat = qcelemental.models.Molecule.from_data('''
... 1 2
... C -4.43914 1.67538 -0.14135
... C -2.91385 1.70652 -0.10603
... H -4.82523 2.67391 -0.43607
... H -4.84330 1.41950 0.86129
... H -4.79340 0.92520 -0.88015
... H -2.59305 2.48187 0.62264
... H -2.53750 1.98573 -1.11429
... C -2.34173 0.34025 0.29616
... H -2.72306 0.06156 1.30365
... C -0.80326 0.34498 0.31454
... H -2.68994 -0.42103 -0.43686
... O -0.32958 1.26295 1.26740
... H -0.42012 0.59993 -0.70288
... C -0.26341 -1.04173 0.66218
... H -0.61130 -1.35318 1.67053
... H 0.84725 -1.02539 0.65807
... H -0.60666 -1.78872 -0.08521
... H -0.13614 2.11102 0.78881
... ''')
>>> two_pentanol_radcat.get_molecular_formula(chgmult=True)
'2^C5H12O+'
Notes
This includes all atoms in the molecule, including ghost atoms. See
element_composition()
to exclude.
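The alphabetical ordering rule is simple enough to sketch directly. The helper below is hypothetical and only covers the default alphabetical case without chgmult; like the method, it counts every symbol it is given, so ghost atoms would be included unless filtered out beforehand.

```python
from collections import Counter
from typing import Iterable


def alphabetical_formula(symbols: Iterable[str]) -> str:
    """Build a formula with title-case symbols sorted alphabetically.

    A count of one is omitted, so ["H", "Cl"] gives "ClH" (hypothetical helper).
    """
    counts = Counter(sym.title() for sym in symbols)
    return "".join(f"{el}{n if n > 1 else ''}" for el, n in sorted(counts.items()))


print(alphabetical_formula(["H", "C", "H", "H", "H"]))
print(alphabetical_formula(["H", "Cl"]))
```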
- measure(measurements, *, degrees=True)[source]
Takes a measurement of the molecule from the indices provided.
- Parameters:
measurements (Union[List[int], List[List[int]]]) – Either a single list of indices or multiple lists. Returns a distance, angle, or dihedral depending on whether 2, 3, or 4 indices are provided, respectively. Distances are returned in Bohr; angles and dihedrals are returned in degrees unless degrees is False.
degrees (bool) – Returns degrees by default, radians otherwise.
- Returns:
Either a value or list of the measured values.
- Return type:
Union[float, List[float]]
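The three kinds of measurement correspond to standard vector formulas, sketched below in plain Python. The function measure_points is a hypothetical stand-in that takes explicit coordinates; distances come back in whatever unit the coordinates use, and the dihedral sign follows one common convention that may differ from the library's.

```python
import math
from typing import List, Sequence

Vec = Sequence[float]


def _sub(a: Vec, b: Vec) -> List[float]:
    return [a[k] - b[k] for k in range(3)]


def _dot(a: Vec, b: Vec) -> float:
    return sum(a[k] * b[k] for k in range(3))


def _cross(a: Vec, b: Vec) -> List[float]:
    return [a[1] * b[2] - a[2] * b[1], a[2] * b[0] - a[0] * b[2], a[0] * b[1] - a[1] * b[0]]


def _norm(a: Vec) -> float:
    return math.sqrt(_dot(a, a))


def measure_points(points: Sequence[Vec], indices: Sequence[int], degrees: bool = True) -> float:
    """Distance (2 indices), angle (3), or dihedral (4) between the selected points."""
    p = [points[i] for i in indices]
    if len(p) == 2:
        return _norm(_sub(p[1], p[0]))
    if len(p) == 3:
        # Angle at the middle point between the two outer points.
        u, v = _sub(p[0], p[1]), _sub(p[2], p[1])
        cos_angle = _dot(u, v) / (_norm(u) * _norm(v))
        angle = math.acos(max(-1.0, min(1.0, cos_angle)))
    elif len(p) == 4:
        # Signed angle between the planes (p0, p1, p2) and (p1, p2, p3).
        b1, b2, b3 = _sub(p[1], p[0]), _sub(p[2], p[1]), _sub(p[3], p[2])
        n1, n2 = _cross(b1, b2), _cross(b2, b3)
        b2_unit = [c / _norm(b2) for c in b2]
        angle = math.atan2(_dot(_cross(n1, b2_unit), n2), _dot(n1, n2))
    else:
        raise ValueError("Expected 2, 3, or 4 indices.")
    return math.degrees(angle) if degrees else angle


pts = [(1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 1.0, 1.0)]
print(measure_points(pts, [0, 1]), measure_points(pts, [0, 1, 2]), measure_points(pts, [0, 1, 2, 3]))
```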
- molecular_weight(ifr=None, real_only=True)[source]
Molecular weight in unified atomic mass units (u).
- nelectrons(ifr=None, real_only=True)[source]
Number of electrons.
- nuclear_repulsion_energy(ifr=None, real_only=True)[source]
Nuclear repulsion energy.
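For reference, the nuclear repulsion energy in atomic units is the pairwise sum of Z_i Z_j / r_ij over distinct nuclei, with distances in Bohr and the result in Hartree. The sketch below applies that formula to explicit inputs; the function name is hypothetical, and fragment selection (ifr) and ghost handling (real_only) are not modeled.

```python
import math
from itertools import combinations
from typing import Sequence


def nuclear_repulsion(charges: Sequence[float], coords_bohr: Sequence[Sequence[float]]) -> float:
    """Sum Z_i * Z_j / r_ij over all unique pairs of nuclei (atomic units)."""
    nuclei = list(zip(charges, coords_bohr))
    return sum(zi * zj / math.dist(ri, rj) for (zi, ri), (zj, rj) in combinations(nuclei, 2))


# Two helium nuclei 6 Bohr apart: 2 * 2 / 6 Hartree.
print(nuclear_repulsion([2, 2], [(0.0, 0.0, -3.0), (0.0, 0.0, 3.0)]))
```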
- orient_molecule()[source]
Centers the molecule and orients it via the inertia tensor before returning a new Molecule.
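The two ingredients of such an orientation, the center of mass and the inertia tensor, can be sketched in plain Python as follows. The helper names are hypothetical, and the sketch stops before the diagonalization step: the principal axes that define the oriented frame are the eigenvectors of this tensor, which would normally be obtained with a linear algebra library.

```python
from typing import List, Sequence


def center_of_mass(masses: Sequence[float], coords: Sequence[Sequence[float]]) -> List[float]:
    total = sum(masses)
    return [sum(m * r[k] for m, r in zip(masses, coords)) / total for k in range(3)]


def inertia_tensor(masses: Sequence[float], coords: Sequence[Sequence[float]]) -> List[List[float]]:
    """Inertia tensor about the center of mass: I_ab = sum m * (r^2 * delta_ab - r_a * r_b)."""
    com = center_of_mass(masses, coords)
    tensor = [[0.0] * 3 for _ in range(3)]
    for m, r in zip(masses, coords):
        d = [r[k] - com[k] for k in range(3)]  # position relative to the center of mass
        r2 = sum(c * c for c in d)
        for a in range(3):
            for b in range(3):
                tensor[a][b] += m * ((r2 if a == b else 0.0) - d[a] * d[b])
    return tensor


masses = [4.0, 4.0]
coords = [(1.0, 1.0, -2.0), (1.0, 1.0, 4.0)]
print(center_of_mass(masses, coords))
print(inertia_tensor(masses, coords))
```

For this linear two-atom case the tensor is already diagonal, with a zero moment along the bond axis.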
- pretty_print()[source]
Print the molecule in Angstroms. Same as print_out(), but always in Angstroms. (The method name in libmints is print_in_angstrom.)
- scramble(*, do_shift=True, do_rotate=True, do_resort=True, deflection=1.0, do_mirror=False, do_plot=False, do_test=False, run_to_completion=False, run_resorting=False, verbose=0)[source]
Generate a Molecule with random or directed translation, rotation, and atom shuffling. Optionally, check that the aligner returns the opposite transformation.
- Parameters:
ref_mol (qcelemental.models.Molecule) – Molecule to perturb.
do_shift (Union[bool, Array, List]) – Whether to generate a random atom shift on interval [-3, 3) in each dimension (True) or leave at current origin. To shift by a specified vector, supply a 3-element list.
do_rotate (Union[bool, Array, List[List]]) – Whether to generate a random 3D rotation according to algorithm of Arvo. To rotate by a specified matrix, supply a 9-element list of lists.
do_resort (Union[bool, List]) – Whether to shuffle atoms (True) or leave 1st atom 1st, etc. (False). To specify shuffle, supply a nat-element list of indices.
deflection (float) – If do_rotate, how random a rotation: 0.0 is no change, 0.1 is small perturbation, 1.0 is completely random.
do_mirror (bool) – Whether to construct the mirror image structure by inverting y-axis.
do_plot (bool) – Pops up a mpl plot showing before, after, and ref geometries.
do_test (bool) – Additionally, run the aligner on the returned Molecule and check that opposite transformations are obtained.
run_to_completion (bool) – By construction, scrambled systems are fully alignable (final RMSD=0). Even so, True turns off the mechanism to stop when RMSD reaches zero and instead proceeds to worst possible time.
run_resorting (bool) – Even if atoms are not shuffled, test the resorting machinery.
verbose (int) – Print level.
- Return type:
- Returns:
mol (Molecule) – Scrambled copy of ref_mol (self).
data (Dict[key, Any]) – data[‘rmsd’] is the RMSD [A] between ref_mol and the scrambled geometry. data[‘mill’] is an AlignmentMill with fields (shift, rotation, atommap, mirror) that prescribe the transformation from ref_mol to the returned geometry.
- Raises:
AssertionError – If do_test=True and aligner sanity check fails for any of the reverse transformations.
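The idea of a scramble (a rigid shift and rotation plus an atom reshuffle) can be sketched on bare coordinates. This is a simplified stand-in, not the library's procedure: it rotates about the z-axis by a random angle instead of using Arvo's algorithm, has no deflection or mirror option, and the function name scramble_points and the seed parameter are hypothetical.

```python
import math
import random
from typing import Dict, List, Sequence, Tuple

Point = Tuple[float, float, float]


def scramble_points(points: Sequence[Point], seed: int = 0) -> Tuple[List[Point], Dict[str, object]]:
    """Shift, rotate about z, and shuffle the points; also return the transformation used."""
    rng = random.Random(seed)
    shift = [rng.uniform(-3.0, 3.0) for _ in range(3)]  # random shift per dimension
    theta = rng.uniform(0.0, 2.0 * math.pi)             # rotation angle about the z-axis
    c, s = math.cos(theta), math.sin(theta)

    atommap = list(range(len(points)))
    rng.shuffle(atommap)  # atommap[k] is the original index of scrambled atom k

    scrambled: List[Point] = []
    for i in atommap:
        x, y, z = points[i]
        scrambled.append((c * x - s * y + shift[0], s * x + c * y + shift[1], z + shift[2]))
    return scrambled, {"shift": shift, "theta": theta, "atommap": atommap}


original = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.5)]
new_points, info = scramble_points(original, seed=1)
print(info["atommap"])
```

Because the transformation is rigid, every interatomic distance is unchanged once the atom map is taken into account, which is what makes the scrambled system fully alignable back onto the reference.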
- show(ngl_kwargs=None)[source]
Creates a 3D representation of a molecule that can be manipulated in Jupyter Notebooks and exported as images (.png).
- to_file(filename, dtype=None)[source]
Writes the Molecule to a file.
- to_string(dtype, units=None, *, atom_format=None, ghost_format=None, width=17, prec=12, return_data=False)[source]
Returns a string that can be used by a variety of programs.
It is unclear whether this will be removed or renamed to “to_psi4_string” in the future.
Suggest psi4 –> psi4frag and psi4 route to to_string.
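The width and prec arguments read like fixed-point field settings, and the sketch below shows how such settings shape a coordinate line. It is only an illustration of the formatting idea: the function name format_xyz_lines and the 12-character label column are assumptions, and no dtype-specific layout or unit conversion is performed.

```python
from typing import Sequence


def format_xyz_lines(
    symbols: Sequence[str],
    coords: Sequence[Sequence[float]],
    width: int = 17,
    prec: int = 12,
) -> str:
    """Format one 'symbol x y z' line per atom using fixed-width float fields."""
    lines = []
    for sym, xyz in zip(symbols, coords):
        fields = " ".join(f"{float(v):{width}.{prec}f}" for v in xyz)
        lines.append(f"{sym:<12} {fields}")  # label column width is a hypothetical choice
    return "\n".join(lines)


print(format_xyz_lines(["He", "He"], [(0.0, 0.0, -3.0), (0.0, 0.0, 3.0)]))
```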
- property atom_labels: Array
- property atomic_numbers: Array
- property fragments: List[Array]
- property hash_fields
- property mass_numbers: Array
- property masses: Array
- property real: Array