Environment Detection

QCEngine can inspect the current compute environment to determine the resources available to it.

Node Description

QCEngine matches the current node against its known node descriptors to obtain general information about the node:

>>> import qcengine as qcng
>>> qcng.config.get_node_descriptor()
<NodeDescriptor hostname_pattern='*' name='default' scratch_directory=None
                memory=5.568 memory_safety_factor=10 ncores=4 jobs_per_node=2>

Config

The job configuration is generated from the current node descriptor and can be overridden on a per-task basis:

>>> qcng.get_config()
<JobConfig ncores=2 memory=2.506 scratch_directory=None>

>>> qcng.get_config(task_config={"scratch_directory": "/tmp"})
<JobConfig ncores=2 memory=2.506 scratch_directory='/tmp'>

>>> import os
>>> os.environ["SCRATCH"] = "/my_scratch"
>>> qcng.get_config(task_config={"scratch_directory": "$SCRATCH"})
<JobConfig ncores=2 memory=2.506 scratch_directory='/my_scratch'>
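
Other JobConfig fields can typically be overridden the same way. The following is a sketch only; the accepted keys (here ncores and memory) and the exact repr may differ between QCEngine versions:

>>> qcng.get_config(task_config={"ncores": 1, "memory": 1.0})
<JobConfig ncores=1 memory=1.0 scratch_directory=None>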

Global Environment

The global environment can also be inspected directly:

>>> qcng.config.get_global()
{
    'hostname': 'qcarchive.molssi.org',
    'memory': 5.568,
    'username': 'user',
    'ncores': 4,
    'cpuinfo': {
        'python_version': '3.6.7.final.0 (64 bit)',
        'brand': 'Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz',
        'hz_advertised': '2.9000 GHz',
        ...
    },
    'cpu_brand': 'Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz'
}
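
Because get_global() returns a plain dictionary, individual values can be read with ordinary key access. For example, reusing the values shown above:

>>> env = qcng.config.get_global()
>>> env["ncores"], env["memory"]
(4, 5.568)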

Configuration Files

The computational environment defaults can be overridden by configuration files.

Configuration files must be named qcengine.yaml and stored in one of three locations: the directory from which you run QCEngine, a folder named .qcarchive in your home directory, or a folder specified by the DQM_CONFIG_PATH environment variable. Only one configuration file is used if several are present: the DQM_CONFIG_PATH location takes precedence over the current directory, which in turn takes precedence over the .qcarchive folder.

The configuration file is a YAML file that contains a dictionary of different node configurations. The keys in the YAML file are human-friendly names for the configurations; the values are dictionaries that define the configuration for each node, following the NodeDescriptor schema:

pydantic model qcengine.config.NodeDescriptor

Description of an individual node

JSON schema:
{
   "title": "NodeDescriptor",
   "description": "Description of an individual node",
   "type": "object",
   "properties": {
      "hostname_pattern": {
         "title": "Hostname Pattern",
         "type": "string"
      },
      "name": {
         "title": "Name",
         "type": "string"
      },
      "scratch_directory": {
         "title": "Scratch Directory",
         "type": "string"
      },
      "memory": {
         "title": "Memory",
         "type": "number"
      },
      "memory_safety_factor": {
         "title": "Memory Safety Factor",
         "default": 10,
         "type": "integer"
      },
      "ncores": {
         "title": "Ncores",
         "description": "Number of cores accessible to each task on this node\n    \n    The default value, ``None``, will allow QCEngine to autodetect the number of cores.",
         "type": "integer"
      },
      "jobs_per_node": {
         "title": "Jobs Per Node",
         "default": 1,
         "type": "integer"
      },
      "retries": {
         "title": "Retries",
         "default": 0,
         "type": "integer"
      },
      "is_batch_node": {
         "title": "Is Batch Node",
         "default": false,
         "help": "Whether the node running QCEngine is a batch node\n    \n    Some clusters are configured such that tasks are launched from a special \"batch\" or \"MOM\" onto the compute nodes.\n    The compute nodes on such clusters often have a different CPU architecture than the batch nodes and \n    often are unable to launch MPI tasks, which has two implications:\n        1) QCEngine must make *all* calls to an executable via ``mpirun`` because the executables might not\n        be able to run on the batch node. \n        2) QCEngine must run on the batch node to be able to launch tasks on the more than one compute nodes  \n    \n    ``is_batch_node`` is used when creating the task configuration as a means of determining whether\n    ``mpiexec_command`` must always be used even for serial jobs (e.g., getting the version number)\n    ",
         "type": "boolean"
      },
      "mpiexec_command": {
         "title": "Mpiexec Command",
         "description": "Invocation for launching node-parallel tasks with MPI\n        \n        The invocation need not specify the number of nodes, tasks, or cores per node.\n        Information about the task configuration will be added to the command by use of\n        Python's string formatting. The configuration will be supplied as the following variables:\n        \n            {nnodes} - Number of nodes\n            {ranks_per_node} - Number of MPI ranks per node\n            {cores_per_rank} - Number of cores to use for each MPI rank\n            {total_ranks} - Total number of MPI ranks\n            \n        As examples, the ``aprun`` command on Cray systems should be similar to \n        ``aprun -n {total_ranks} -N {ranks_per_node}`` and ``mpirun`` from OpenMPI should\n        be similar to ``mpirun -np {total_ranks} -N {ranks_per_node}``.\n        \n        Programs where each MPI rank can use multiple threads (e.g., QC programs with MPI+OpenMP) can \n        use the {cores_per_rank} option to control the hybrid parallelism. \n        As an example, the Cray ``aprun`` command using this figure could be:\n        ``aprun -n {total_ranks} -N {ranks_per_node} -d {cores_per_rank} -j 1``.\n        The appropriate number of ranks per node will be determined based on the number of\n        cores per node and the number of cores per rank.\n        ",
         "type": "string"
      }
   },
   "required": [
      "hostname_pattern",
      "name"
   ],
   "additionalProperties": false
}

Fields:
field hostname_pattern: str [Required]
field is_batch_node: bool = False
field jobs_per_node: int = 1
field memory: Optional[float] = None
field memory_safety_factor: int = 10
field mpiexec_command: Optional[str] = None

Invocation for launching node-parallel tasks with MPI

The invocation need not specify the number of nodes, tasks, or cores per node. Information about the task configuration will be added to the command by use of Python’s string formatting. The configuration will be supplied as the following variables:

{nnodes} - Number of nodes
{ranks_per_node} - Number of MPI ranks per node
{cores_per_rank} - Number of cores to use for each MPI rank
{total_ranks} - Total number of MPI ranks

As examples, the aprun command on Cray systems should be similar to aprun -n {total_ranks} -N {ranks_per_node} and mpirun from OpenMPI should be similar to mpirun -np {total_ranks} -N {ranks_per_node}.

Programs where each MPI rank can use multiple threads (e.g., QC programs with MPI+OpenMP) can use the {cores_per_rank} variable to control the hybrid parallelism. As an example, the Cray aprun command using this variable could be: aprun -n {total_ranks} -N {ranks_per_node} -d {cores_per_rank} -j 1. The appropriate number of ranks per node will be determined based on the number of cores per node and the number of cores per rank.
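
For illustration only, the template is filled in with Python's string formatting, as described above. Hypothetical values of 2 nodes, 4 ranks per node, and 1 core per rank (8 total ranks) would expand the OpenMPI-style command as:

>>> "mpirun -np {total_ranks} -N {ranks_per_node}".format(
...     nnodes=2, ranks_per_node=4, cores_per_rank=1, total_ranks=8)
'mpirun -np 8 -N 4'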

field name: str [Required]
field ncores: Optional[int] = None

Number of cores accessible to each task on this node

The default value, None, will allow QCEngine to autodetect the number of cores.

field retries: int = 0
field scratch_directory: Optional[str] = None

When running QCEngine, the proper configuration for a node is determined by matching the node's hostname against the hostname_pattern of each configuration defined in qcengine.yaml.

An example qcengine.yaml file that sets the scratch directory for all nodes is as follows:

all:
  hostname_pattern: "*"
  scratch_directory: ./scratch
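
Because the top-level keys are free-form names, several node configurations can live in the same file. The following sketch uses hypothetical hostname patterns and resource values; adjust them to your own machines:

workstation:
  hostname_pattern: "mybox*"      # hypothetical workstation hostname pattern
  ncores: 8
  scratch_directory: /tmp

bigmem_node:
  hostname_pattern: "bigmem*"     # hypothetical large-memory node pattern
  memory: 120
  jobs_per_node: 2
  scratch_directory: ./scratch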

Cluster Configuration

A node configuration file is required when using node-parallel tasks on a compute cluster. The configuration file must contain a description of the command used to launch MPI tasks and, in some cases, the designation that a certain node is a batch node. See the descriptions for mpiexec_command and is_batch_node in the NodeDescriptor documentation for further details.
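
As a rough sketch only (the hostname pattern, resource values, and exact launcher flags are placeholders to adapt for your cluster), an entry combining these options might look like:

cray_example:
  hostname_pattern: "batch-*"     # hypothetical pattern for the batch/MOM nodes
  is_batch_node: true
  mpiexec_command: "aprun -n {total_ranks} -N {ranks_per_node} -d {cores_per_rank} -j 1"
  ncores: 64
  memory: 180
  scratch_directory: ./scratch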