How do I search for specific types of computations?

This notebook introduces you to the basics of connecting to a QCArchive server and retrieving computation results using information like molecule, basis set, method, or other computation details.

You can retrieve results from QCArchive using the get_records method if you know the ID of the computation you’d like to retrieve. However, you can also query the database for computations having specific details using query methods.

import qcportal as ptl

Create a client object and connect to the demo server

The PortalClient is how you interact with the server, including querying records and submitting computations.

The demo server allows for unauthenticated guest access, so no username/password is necessary to read from the server. However, you will need to log in to submit or modify computations.

# Guest access
client = ptl.PortalClient("https://qcademo.molssi.org")
WARNING: This client version is newer than the server version. This may work if the versions are close, but expect exceptions and errors if attempting things the server does not support. client version: 0.53.post29+gd4bb2f0e, server version: 0.53

Connecting with username/password

If you have a username and password, pass them as arguments when you create the client.

client = ptl.PortalClient("https://qcademo.molssi.org", username="YOUR_USERNAME", password="YOUR_PASSWORD")

⚠️Caution⚠️: Always handle credentials with care. Never commit sensitive information like usernames or passwords to public repositories.
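One common way to keep credentials out of a notebook is to read them from environment variables. The sketch below assumes two variable names, QCA_DEMO_USER and QCA_DEMO_PASSWORD, which are hypothetical and chosen only for this example. The helper function portal_credentials is also not part of qcportal; it only builds the keyword arguments shown in the connection example above.

```python
import os


def portal_credentials(env=os.environ):
    """Return username/password keyword arguments read from environment variables.

    QCA_DEMO_USER and QCA_DEMO_PASSWORD are hypothetical names used only
    in this example; pick whatever names suit your own setup.
    """
    username = env.get("QCA_DEMO_USER")
    password = env.get("QCA_DEMO_PASSWORD")
    if username is None or password is None:
        # No credentials configured: return nothing so the client uses guest access
        return {}
    return {"username": username, "password": password}


# Usage (requires qcportal and network access):
# import qcportal as ptl
# client = ptl.PortalClient("https://qcademo.molssi.org", **portal_credentials())
```

When neither variable is set, the helper returns an empty dictionary, so the same line of code falls back to guest access.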

Querying Records

Use the `query_records` method for general queries. This method searches across all records in the database, regardless of the computation type. Because query_records searches all record types, you can only filter on fields that are common to all records.

help(client.query_records)
Help on method query_records in module qcportal.client:

query_records(*, record_id: 'Optional[Union[int, Iterable[int]]]' = None, record_type: 'Optional[Union[str, Iterable[str]]]' = None, manager_name: 'Optional[Union[str, Iterable[str]]]' = None, status: 'Optional[Union[RecordStatusEnum, Iterable[RecordStatusEnum]]]' = None, dataset_id: 'Optional[Union[int, Iterable[int]]]' = None, parent_id: 'Optional[Union[int, Iterable[int]]]' = None, child_id: 'Optional[Union[int, Iterable[int]]]' = None, created_before: 'Optional[Union[datetime, str]]' = None, created_after: 'Optional[Union[datetime, str]]' = None, modified_before: 'Optional[Union[datetime, str]]' = None, modified_after: 'Optional[Union[datetime, str]]' = None, owner_user: 'Optional[Union[int, str, Iterable[Union[int, str]]]]' = None, owner_group: 'Optional[Union[int, str, Iterable[Union[int, str]]]]' = None, limit: 'int' = None, include: 'Optional[Iterable[str]]' = None) -> 'RecordQueryIterator[BaseRecord]' method of qcportal.client.PortalClient instance
    Query records of all types based on common fields
    
    This is a general query of all record types, so it can only filter by fields
    that are common among all records.
    
    Do not rely on the returned records being in any particular order.
    
    Parameters
    ----------
    record_id
        Query records whose ID is in the given list
    record_type
        Query records whose type is in the given list
    manager_name
        Query records that were completed (or are currently runnning) on a manager is in the given list
    status
        Query records whose status is in the given list
    dataset_id
        Query records that are part of a dataset is in the given list
    parent_id
        Query records that have a parent is in the given list
    child_id
        Query records that have a child is in the given list
    created_before
        Query records that were created before the given date/time
    created_after
        Query records that were created after the given date/time
    modified_before
        Query records that were modified before the given date/time
    modified_after
        Query records that were modified after the given date/time
    owner_user
        Query records owned by a user in the given list (usernames or IDs)
    owner_group
        Query records owned by a group in the given list (group names or IDS)
    limit
        The maximum number of records to return. Note that the server limit is always obeyed.
    include
        Additional fields to include in the returned record
    
    Returns
    -------
    :
        An iterator that can be used to retrieve the results of the query

For example, to query for computations created between January 10, 2023 and January 14, 2023, we could do the following.

results = client.query_records(created_after="2023/01/10", created_before="2023/01/14")
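The signature printed above lists created_after and created_before as accepting either a datetime or a string, so the same window can likely be expressed with datetime objects. The sketch below is a minimal illustration under that assumption; date_window is a hypothetical helper, not part of qcportal, and how the server treats time zones for naive datetimes is not covered here.

```python
from datetime import datetime


def date_window(start, end):
    """Build created_after/created_before keyword arguments from two datetimes."""
    if start >= end:
        raise ValueError("start must be earlier than end")
    return {"created_after": start, "created_before": end}


window = date_window(datetime(2023, 1, 10), datetime(2023, 1, 14))

# With a connected client, the window could be passed as keyword arguments:
# results = client.query_records(**window)
```

Checking the order of the two dates up front avoids sending a query that could never match anything.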

The query returns an iterator rather than a list. You can convert the iterator to a list with list(), or loop over it directly in a for loop.

results_list = list(results)
print(f"Found {len(results_list)} results.")
Found 935 results.
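A standard Python iterator can only be consumed once, which is worth keeping in mind when working with query results. The sketch below uses a plain generator with made-up record IDs as a stand-in; it does not contact a server, and whether the qcportal iterator offers a way to restart is not assumed here.

```python
def fake_query():
    """Stand-in for a query result: yields three made-up record IDs."""
    yield from (101, 102, 103)


results = fake_query()
first_pass = list(results)   # consumes every item in the iterator
second_pass = list(results)  # the iterator is already exhausted

print(first_pass)
print(second_pass)
```

Storing the results in a list, as done with results_list above, lets you reuse them as many times as needed.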

After the results are retrieved, you can work with the records as shown in the “How do I work with computation records?” tutorial.
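Because query_records can return records of several types at once, a quick tally by type can be a useful first look at the results. The sketch below assumes each record exposes a record_type attribute, matching the record_type filter in the signature above. The function count_record_types is hypothetical, and the records are made-up stand-ins rather than real server data.

```python
from collections import Counter
from types import SimpleNamespace


def count_record_types(records):
    """Count how many records of each type appear in an iterable of records."""
    return Counter(rec.record_type for rec in records)


# Made-up stand-ins; real records would come from client.query_records(...)
fake_records = [
    SimpleNamespace(record_type="singlepoint"),
    SimpleNamespace(record_type="optimization"),
    SimpleNamespace(record_type="singlepoint"),
]

counts = count_record_types(fake_records)
print(counts.most_common())
```

The function accepts any iterable, so it should work equally well on a list such as results_list or on the query iterator itself.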

Querying by computation details

If you want to query by computation specifications such as basis set, method, or molecule, you will need to use a more specific query method. For example, to query single point computations, use the query_singlepoints method. Documentation for the query_singlepoints method is shown below.

help(client.query_singlepoints)
Help on method query_singlepoints in module qcportal.client:

query_singlepoints(*, record_id: 'Optional[Union[int, Iterable[int]]]' = None, manager_name: 'Optional[Union[str, Iterable[str]]]' = None, status: 'Optional[Union[RecordStatusEnum, Iterable[RecordStatusEnum]]]' = None, dataset_id: 'Optional[Union[int, Iterable[int]]]' = None, parent_id: 'Optional[Union[int, Iterable[int]]]' = None, created_before: 'Optional[Union[datetime, str]]' = None, created_after: 'Optional[Union[datetime, str]]' = None, modified_before: 'Optional[Union[datetime, str]]' = None, modified_after: 'Optional[Union[datetime, str]]' = None, program: 'Optional[Union[str, Iterable[str]]]' = None, driver: 'Optional[Union[SinglepointDriver, Iterable[SinglepointDriver]]]' = None, method: 'Optional[Union[str, Iterable[str]]]' = None, basis: 'Optional[Union[str, Iterable[Optional[str]]]]' = None, keywords: 'Optional[Union[Dict[str, Any], Iterable[Dict[str, Any]]]]' = None, molecule_id: 'Optional[Union[int, Iterable[int]]]' = None, owner_user: 'Optional[Union[int, str, Iterable[Union[int, str]]]]' = None, owner_group: 'Optional[Union[int, str, Iterable[Union[int, str]]]]' = None, limit: 'Optional[int]' = None, include: 'Optional[Iterable[str]]' = None) -> 'RecordQueryIterator[SinglepointRecord]' method of qcportal.client.PortalClient instance
    Queries singlepoint records on the server
    
    Do not rely on the returned records being in any particular order.
    
    Parameters
    ----------
    record_id
        Query records whose ID is in the given list
    manager_name
        Query records that were completed (or are currently runnning) on a manager is in the given list
    status
        Query records whose status is in the given list
    dataset_id
        Query records that are part of a dataset is in the given list
    parent_id
        Query records that have a parent is in the given list
    created_before
        Query records that were created before the given date/time
    created_after
        Query records that were created after the given date/time
    modified_before
        Query records that were modified before the given date/time
    modified_after
        Query records that were modified after the given date/time
    program
        Query records whose program is in the given list
    driver
        Query records whose driver is in the given list
    method
        Query records whose method is in the given list
    basis
        Query records whose basis is in the given list
    keywords
        Query records with these keywords (exact match)
    molecule_id
        Query records whose molecule (id) is in the given list
    owner_user
        Query records owned by a user in the given list
    owner_group
        Query records owned by a group in the given list
    limit
        The maximum number of records to return. Note that the server limit is always obeyed.
    include
        Additional fields to include in the returned record
    
    Returns
    -------
    :
        An iterator that can be used to retrieve the results of the query

As shown in the help message above, you can query single points on many different parameters. For example, you might query the database for mp2 calculations that use the aug-cc-pvtz basis set and were run with the psi4 program. For the sake of demonstration in this notebook, we limit the number of results to 5 records.

results = client.query_singlepoints(method="mp2", basis="aug-cc-pvtz", program="psi4", limit=5)
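The signature above shows that method and basis accept either a single string or an iterable of strings, and the parameter descriptions say a record matches when its value is in the given list. The sketch below illustrates passing lists. RecordingClient is a hypothetical stand-in that only records the filters it receives, and the extra method and basis names are arbitrary examples; against a real server you would pass a connected PortalClient instead.

```python
class RecordingClient:
    """Hypothetical stand-in that records the filters it receives.

    It mimics only the keyword-argument call style of query_singlepoints
    and returns an empty iterator instead of contacting a server.
    """

    def __init__(self):
        self.last_filters = None

    def query_singlepoints(self, **filters):
        self.last_filters = filters
        return iter(())


def query_several(client, limit=5):
    """Query two methods and two basis sets in a single call."""
    return client.query_singlepoints(
        method=["mp2", "b3lyp"],                # either method may match
        basis=["aug-cc-pvtz", "aug-cc-pvdz"],   # either basis may match
        program="psi4",
        limit=limit,
    )


demo_client = RecordingClient()
demo_results = list(query_several(demo_client))
print(demo_client.last_filters)
```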

After retrieving the results, we can loop through them and view information about the records.

for record in results:
    print(record.id, record.molecule)
28106 Molecule(name='Ne', formula='Ne', hash='1fc0bb9')
28105 Molecule(name='Be', formula='Be', hash='505b180')
28104 Molecule(name='N', formula='N', hash='74fbe8e')
28103 Molecule(name='H', formula='H', hash='dee8f82')
28102 Molecule(name='O', formula='O', hash='5bd2dfd')
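If you want to keep this information rather than just print it, the loop can be turned into a small lookup table. The sketch below assumes only the two attributes used in the loop above, record.id and record.molecule, plus the name field visible in the printed molecules. The function summarize is hypothetical, and the records are stand-ins built from the IDs and names in the output above.

```python
from types import SimpleNamespace


def summarize(records):
    """Map each record ID to the name of its molecule."""
    return {rec.id: rec.molecule.name for rec in records}


# Stand-ins that mimic the attributes used above; no server is contacted
fake_records = [
    SimpleNamespace(id=28106, molecule=SimpleNamespace(name="Ne")),
    SimpleNamespace(id=28105, molecule=SimpleNamespace(name="Be")),
]

summary = summarize(fake_records)
print(summary)
```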