from_string
- qcelemental.molparse.from_string(molstr, dtype=None, *, name=None, fix_com=None, fix_orientation=None, fix_symmetry=None, return_processed=False, enable_qm=True, enable_efp=True, missing_enabled_return_qm='none', missing_enabled_return_efp='none', verbose=1)
Construct a molecule dictionary from any recognized string format.
- Parameters:
molstr (str) – Multiline string specification of molecule in a recognized format.
dtype (Optional[str]) – {‘xyz’, ‘xyz+’, ‘psi4’, ‘psi4+’} Molecule format name; see below for details.
return_processed (bool) – Additionally return intermediate dictionary.
enable_qm (bool) – Consider quantum mechanical domain in processing the string contents into the returned molrec.
enable_efp (bool) – Consider effective fragment potential domain in processing the string contents into the returned molrec. Only relevant if dtype supports EFP.
missing_enabled_return_qm (str) – {‘minimal’, ‘none’, ‘error’} If enable_qm=True, what to do if it has no atoms/fragments? Respectively, return a fully valid but empty molrec, return an empty dictionary, or throw an error.
missing_enabled_return_efp (str) – {‘minimal’, ‘none’, ‘error’} If enable_efp=True, what to do if it has no atoms/fragments? Respectively, return a fully valid but empty molrec, return an empty dictionary, or throw an error.
name (Optional[str]) – Override molstr information for the molecule label; should be a valid Python identifier. One of a very limited number of fields (three others follow) for trumping molstr. Provided for convenience, since the alternative would be to collect the resulting molrec (discarding the Mol if called from class), edit it, then remake the Mol.
fix_com (Optional[bool]) – Override molstr information for whether translation of geom is allowed or disallowed.
fix_orientation (Optional[bool]) – Override molstr information for whether rotation of geom is allowed or disallowed.
fix_symmetry (Optional[str]) – Override molstr information for the maximal point group symmetry with which the geometry should be treated.
- Return type:
- Returns:
molrec (dict) – Molecule dictionary spec. See from_arrays().
molinit (dict, optional) – Intermediate “molrec”-like dictionary containing molstr info after parsing by this function but before the validation and defaulting of from_arrays() that returns the proper molrec. Only provided if return_processed is True.
- Raises:
qcelemental.MoleculeFormatError – After processing of molstr, only an empty string should remain. Anything left is a syntax error.
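The "only an empty string should remain" rule can be pictured as a consume-and-check pass: every recognized pattern is removed from a working copy of the string, and any leftover text is reported as a syntax error. The sketch below illustrates that idea with the standard library only. It is not QCElemental's implementation: `ATOM_LINE`, `parse_atoms`, and `LeftoverTextError` are hypothetical names, and the regular expression covers only a simple `<symbol> <x> <y> <z>` line.

```python
import re


class LeftoverTextError(ValueError):
    """Hypothetical stand-in for qcelemental.MoleculeFormatError."""


NUMBER = r"[-+]?(?:\d+\.?\d*|\.\d+)"
# Illustrative pattern for one "<symbol> <x> <y> <z>" line; not the library's regex.
ATOM_LINE = re.compile(
    rf"^[ \t]*(?P<elem>[A-Za-z]{{1,2}})[ \t]+(?P<x>{NUMBER})"
    rf"[ \t]+(?P<y>{NUMBER})[ \t]+(?P<z>{NUMBER})[ \t]*$",
    re.MULTILINE,
)


def parse_atoms(text):
    """Collect atoms from text and raise if anything unrecognized is left over."""
    atoms = []

    def consume(match):
        atoms.append(
            (match["elem"], float(match["x"]), float(match["y"]), float(match["z"]))
        )
        return ""  # delete the recognized line from the working string

    remainder = ATOM_LINE.sub(consume, text).strip()
    if remainder:
        raise LeftoverTextError(f"Unprocessable remnants:\n{remainder}")
    return atoms


print(parse_atoms("O 0 0 0.117\nH 0 0.757 -0.471\n"))

try:
    parse_atoms("O 0 0 0.117\nH 0 0.757\n")  # second line lacks a z coordinate
except LeftoverTextError as err:
    print(err)
```

Reporting the leftover text, rather than the first line that failed to match, lets the caller see everything the parser could not account for in one message.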
Notes
Several formats are interpretable:
xyz - Strict XYZ format
-----------------------

    String Layout
    -------------
    <number of atoms>
    comment line
    <element_symbol or atomic_number> <x> <y> <z>
    ...
    <element_symbol or atomic_number> <x> <y> <z>

    QM Domain
    ---------
    Specifiable: geom, elem/elez (element identity)
    Inaccessible: mass, real (vs. ghost), elbl (user label), name, units (assumed [A]), input_units_to_au,
                  fix_com/orientation/symmetry, fragmentation,
                  molecular_charge, molecular_multiplicity

    Notes
    -----
    <number of atoms> is pattern-matched but ignored.

xyz+ - Enhanced XYZ format
--------------------------

    String Layout
    -------------
    <number of atoms> [<bohr|au|ang>] [<molecular_charge> <molecular_multiplicity>]
    comment line
    <psi4_nucleus_spec> <x> <y> <z>
    ...
    <psi4_nucleus_spec> <x> <y> <z>

    QM Domain
    ---------
    Specifiable: geom, elem/elez (element identity), mass, real (vs. ghost), elbl (user label),
                 units (defaults [A]), molecular_charge, molecular_multiplicity
    Inaccessible: name, input_units_to_au, fix_com/orientation/symmetry, fragmentation

    Notes
    -----
    <number of atoms> is pattern-matched but ignored.

psi4 - Psi4 molecule {...} format
---------------------------------

    QM Domain
    ---------
    Specifiable: geom, elem/elez (element identity), mass, real (vs. ghost), elbl (user label),
                 units (defaults [A]), fix_com/orientation/symmetry, fragment_separators,
                 fragment_charges, fragment_multiplicities, molecular_charge, molecular_multiplicity
    Inaccessible: name, input_units_to_au

    PubChem
    -------
    pubchem : <cid|name|formula> [*]
    A string like the above searches the PubChem database and substitutes the below.
    Adding the wildcard searches for multiple matches and raises ChoicesError with
    matches for further consideration attached.
    Specifiable: geom, elem/elez (element identity), units (fixed [A]), molecular_charge,
                 molecular_multiplicity (fixed singlet), name

    EFP Domain
    ----------
    Specifiable: units, fix_com/orientation/symmetry, fragment_files, hint_types, geom_hints
    Inaccessible: anything atomic or fragment details -- geom, elem/elez (element identity),
                  mass, real (vs. ghost), elbl (user label), fragment_separators,
                  fragment_charges, fragment_multiplicities, molecular_charge, molecular_multiplicity

psi4+ - Psi4 non-Cartesian molecule {...} format
------------------------------------------------

    Like `dtype=psi4` (although combination with EFP not tested) except that instead of
    pure-Cartesian geometry, allow variables, zmatrix, and un-fully-specified geometries.
    *Not* MolSSI standard, but we're not dropping zmatrix yet.
    Note that in Psi4 internal coordinates defined through a zmatrix have no bearing on
    geometry optimization internals or constraints.