📚 DOCS: Add example for retrieve_temporary_list #5157

Merged
merged 2 commits into from
Oct 4, 2021
29 changes: 25 additions & 4 deletions docs/source/topics/calculations/usage.rst
@@ -518,17 +518,38 @@ The target relative path is also compatible with glob patterns in the source rel

Retrieve temporary list
~~~~~~~~~~~~~~~~~~~~~~~

Recall that, as explained in the :ref:`'prepare' section<topics:calculations:usage:calcjobs:prepare>`, all the files that the engine retrieves following the 'retrieve list' are stored in the ``retrieved`` folder data node.
This means that any file you retrieve for a completed calculation job will be stored in your repository.
If you are retrieving big files, this can cause your repository to grow significantly.
Often, however, you might only need a part of the information contained in these retrieved files.
To solve this common issue, there is the concept of the 'retrieve temporary list'.
The specification of the retrieve temporary list is identical to that of the normal :ref:`retrieve list<topics:calculations:usage:calcjobs:file_lists_retrieve>`, but it is added to the ``calc_info`` under the ``retrieve_temporary_list`` attribute:

.. code-block:: python

    from aiida.common.datastructures import CalcInfo

    calcinfo = CalcInfo()
    calcinfo.retrieve_temporary_list = ['relative/path/to/file.txt']

The difference in behaviour is that, unlike the files of the retrieve list, which are permanently stored in the retrieved :py:class:`~aiida.orm.nodes.data.folder.FolderData` node, the files of the retrieve temporary list are stored in a temporary sandbox folder.
This folder is then passed under the ``retrieved_temporary_folder`` keyword argument to the ``parse`` method of the :ref:`parser<topics:calculations:usage:calcjobs:parsers>`, if one was specified for the calculation job:

.. code-block:: python

    def parse(self, **kwargs):
        """Parse the retrieved files of the calculation job."""

        retrieved_temporary_folder = kwargs['retrieved_temporary_folder']

The parser implementation can then parse these files and store the relevant information as output nodes.

.. important::

    The type of ``kwargs['retrieved_temporary_folder']`` is a simple ``str`` that represents the `absolute` filepath to the temporary folder.
    You can access its contents with the ``os`` standard library module or convert it into a ``pathlib.Path``.

After the parser terminates, the engine will automatically clean up the sandbox folder with the temporarily retrieved files.
The concept of the ``retrieve_temporary_list`` is essentially that the files will be available during parsing and will be destroyed immediately afterwards.
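
As a minimal sketch of this contract, the snippet below uses only the standard library.
Here ``tempfile.TemporaryDirectory`` merely stands in for the sandbox folder that the engine would create and clean up, and the file name ``big_output.log``, the ``energy = ...`` line format and the helper ``extract_total_energy`` are hypothetical names chosen for illustration, not part of AiiDA.

```python
import tempfile
from pathlib import Path


def extract_total_energy(retrieved_temporary_folder: str) -> float:
    """Return the last ``energy = <value>`` entry of a (hypothetical) output file.

    ``retrieved_temporary_folder`` is a plain ``str`` holding an absolute path,
    as the parser would receive it. The file name ``big_output.log`` is an
    assumption made for this sketch.
    """
    folder = Path(retrieved_temporary_folder)  # convert the str to a pathlib.Path
    energy = None
    with (folder / 'big_output.log').open() as handle:
        for line in handle:  # read line by line, since the file may be large
            if line.startswith('energy ='):
                energy = float(line.split('=')[1])
    if energy is None:
        raise ValueError('no energy entry found')
    return energy


# Stand-in for the engine: create a sandbox, fill it, parse it, clean it up.
with tempfile.TemporaryDirectory() as sandbox:
    sandbox_path = Path(sandbox)
    log = sandbox_path / 'big_output.log'
    log.write_text('step 1\nenergy = -1.5\nstep 2\nenergy = -2.25\n')
    total_energy = extract_total_energy(sandbox)

# The sandbox no longer exists here; only the extracted value survives.
print(total_energy, sandbox_path.exists())
```

In an actual parser, the extracted value would be stored in an output node rather than printed, so that only this small piece of information ends up in the repository.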

.. _topics:calculations:usage:calcjobs:stashing:
