Environment Detection
QCEngine can inspect the current compute environment to determine the resources available to it.
Node Description
QCEngine can detect node descriptions to obtain general information about the current node.
>>> qcng.config.get_node_descriptor()
<NodeDescriptor hostname_pattern='*' name='default' scratch_directory=None
memory=5.568 memory_safety_factor=10 ncores=4 jobs_per_node=2>
Config
The job configuration is built from the current node descriptor and can be overridden:
>>> qcng.get_config()
<JobConfig ncores=2 memory=2.506 scratch_directory=None>
>>> qcng.get_config(task_config={"scratch_directory": "/tmp"})
<JobConfig ncores=2 memory=2.506 scratch_directory='/tmp'>
>>> os.environ["SCRATCH"] = "/my_scratch"
>>> qcng.get_config(task_config={"scratch_directory": "$SCRATCH"})
<JobConfig ncores=2 memory=2.506 scratch_directory='/my_scratch'>
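Resource limits can be overridden the same way. The sketch below assumes that ncores and memory are accepted as task_config keys, mirroring the fields shown in the JobConfig above; the values and the echoed output are illustrative:
>>> qcng.get_config(task_config={"ncores": 1, "memory": 1.2})
<JobConfig ncores=1 memory=1.2 scratch_directory=None>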
Global Environment
The global environment can also be inspected directly.
>>> qcng.config.get_global()
{
'hostname': 'qcarchive.molssi.org',
'memory': 5.568,
'username': 'user',
'ncores': 4,
'cpuinfo': {
'python_version': '3.6.7.final.0 (64 bit)',
'brand': 'Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz',
'hz_advertised': '2.9000 GHz',
...
},
'cpu_brand': 'Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz'
}
Configuration Files
The computational environment defaults can be overridden by configuration files.
Configuration files must be named qcengine.yaml and stored either in the directory from which you run QCEngine, in a folder named .qcarchive in your home directory, or in a folder specified by the DQM_CONFIG_PATH environment variable.
Only one configuration file will be used if multiple are available: the DQM_CONFIG_PATH configuration file takes precedence over the current directory, which takes precedence over the .qcarchive folder.
The configuration file is a YAML file that contains a dictionary of different node configurations.
The keys in the YAML file are human-friendly names for the configurations, and the values are dictionaries that define configurations for different nodes, following the NodeDescriptor schema:
pydantic model qcengine.config.NodeDescriptor
Description of an individual node
- Fields:
- field hostname_pattern: str [Required]
- field is_batch_node: bool = False
Whether the node running QCEngine is a batch node. Some clusters are configured such that tasks are launched from a special "batch" or "MOM" node onto the compute nodes. The compute nodes on such clusters often have a different CPU architecture than the batch nodes and are often unable to launch MPI tasks, which has two implications: (1) QCEngine must make all calls to an executable via mpirun because the executables might not be able to run on the batch node, and (2) QCEngine must run on the batch node to be able to launch tasks on more than one compute node. is_batch_node is used when creating the task configuration as a means of determining whether mpiexec_command must always be used, even for serial jobs (e.g., getting the version number).
- field jobs_per_node: int = 1
- field memory_safety_factor: int = 10
- field mpiexec_command: Optional[str] = None
Invocation for launching node-parallel tasks with MPI. The invocation need not specify the number of nodes, tasks, or cores per node. Information about the task configuration will be added to the command by use of Python's string formatting. The configuration will be supplied as the following variables:
{nnodes} - Number of nodes
{ranks_per_node} - Number of MPI ranks per node
{cores_per_rank} - Number of cores to use for each MPI rank
{total_ranks} - Total number of MPI ranks
As examples, the aprun command on Cray systems should be similar to aprun -n {total_ranks} -N {ranks_per_node}, and mpirun from OpenMPI should be similar to mpirun -np {total_ranks} -N {ranks_per_node}. Programs where each MPI rank can use multiple threads (e.g., QC programs with MPI+OpenMP) can use the {cores_per_rank} option to control the hybrid parallelism. As an example, the Cray aprun command using this option could be: aprun -n {total_ranks} -N {ranks_per_node} -d {cores_per_rank} -j 1. The appropriate number of ranks per node will be determined based on the number of cores per node and the number of cores per rank.
- field name: str [Required]
- field ncores: Optional[int] = None
Number of cores accessible to each task on this node. The default value, None, will allow QCEngine to autodetect the number of cores.
- field retries: int = 0
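The schema can also be exercised directly in Python, for example to check a node entry before adding it to a configuration file. The hostname pattern, name, and core count below are illustrative:
>>> desc = qcng.config.NodeDescriptor(hostname_pattern="node-*", name="example", ncores=8)
>>> desc.jobs_per_node
1
Unspecified fields fall back to the defaults listed above, so jobs_per_node resolves to 1.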
When running QCEngine, the proper configuration for a node is determined by matching the hostname of the node against the hostname_pattern of each configuration defined in qcengine.yaml.
An example qcengine.yaml file that sets the scratch directory for all nodes is as follows:
all:
hostname_pattern: "*"
scratch_directory: ./scratch
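A single file can hold several named configurations; the entry whose hostname_pattern matches the current host is the one applied. A sketch with two hypothetical entries (the names, hostname patterns, and resource values are assumptions to adapt to your own machines):

laptop:
  hostname_pattern: "my-laptop*"
  scratch_directory: /tmp
  ncores: 4

workstation:
  hostname_pattern: "lab-ws-*"
  scratch_directory: /scratch
  jobs_per_node: 2
  memory_safety_factor: 5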
Cluster Configuration
A node configuration file is required when using node-parallel tasks on a compute cluster.
The configuration file must contain a description of the command used to launch MPI tasks and, in some cases, the designation that the node running QCEngine is a batch node.
See the descriptions for mpiexec_command and is_batch_node in the NodeDescriptor documentation for further details.
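As a sketch, a cluster entry typically combines both settings. The hostname pattern, scratch path, and launcher invocation below are hypothetical and follow the aprun example from the mpiexec_command description; adapt them to the actual machine:

cluster:
  hostname_pattern: "nid*"
  scratch_directory: /scratch
  is_batch_node: true
  mpiexec_command: "aprun -n {total_ranks} -N {ranks_per_node} -d {cores_per_rank} -j 1"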