Merge remote-tracking branch 'origin/master' into fix_hybrid_height
schlunma committed Apr 1, 2020
2 parents 0e7a776 + 8e694c0 commit 995d674
Showing 16 changed files with 834 additions and 76 deletions.
2 changes: 1 addition & 1 deletion CITATION.cff
@@ -98,5 +98,5 @@ license: "Apache-2.0"
 message: "If you use this software, please cite it using these metadata."
 repository-code: "https://github.com/ESMValGroup/ESMValCore/"
 title: ESMValCore
-version: "v2.0.0b8"
+version: "v2.0.0b9"
 ...
26 changes: 25 additions & 1 deletion doc/changelog.rst
@@ -1,5 +1,29 @@
 Changelog
 =========

+v2.0.0b9
+--------
+
-For older releases, see the release notes on https://github.com/ESMValGroup/ESMValCore/releases.
+This release includes
+
+Bug fixes
+~~~~~~~~~
+
+- Cast dtype float32 to output from zonal and meridional area preprocessors (`#581 <https://github.com/ESMValGroup/ESMValCore/pull/581>`__) `Valeriu Predoi <https://github.com/valeriupredoi>`__
+
+Improvements
+~~~~~~~~~~~~
+
+- Unpin on Python<3.8 for conda package (run) (`#570 <https://github.com/ESMValGroup/ESMValCore/pull/570>`__) `Valeriu Predoi <https://github.com/valeriupredoi>`__
+- Update pytest installation marker (`#572 <https://github.com/ESMValGroup/ESMValCore/pull/572>`__) `Bouwe Andela <https://github.com/bouweandela>`__
+- Remove vmrh2o (`#573 <https://github.com/ESMValGroup/ESMValCore/pull/573>`__) `Mattia Righi <https://github.com/mattiarighi>`__
+- Restructure documentation (`#575 <https://github.com/ESMValGroup/ESMValCore/pull/575>`__) `Bouwe Andela <https://github.com/bouweandela>`__
+- Fix mask in land variables for CCSM4 (`#579 <https://github.com/ESMValGroup/ESMValCore/pull/579>`__) `Klaus Zimmermann <https://github.com/zklaus>`__
+- Fix derive scripts wrt required method (`#585 <https://github.com/ESMValGroup/ESMValCore/pull/585>`__) `Klaus Zimmermann <https://github.com/zklaus>`__
+- Check coordinates do not have repeated standard names (`#558 <https://github.com/ESMValGroup/ESMValCore/pull/558>`__) `Javier Vegas-Regidor <https://github.com/jvegasbsc>`__
+- Added derivation script for co2s (`#587 <https://github.com/ESMValGroup/ESMValCore/pull/587>`__) `Manuel Schlund <https://github.com/schlunma>`__
+- Adapted custom co2s table to match CMIP6 version (`#588 <https://github.com/ESMValGroup/ESMValCore/pull/588>`__) `Manuel Schlund <https://github.com/schlunma>`__
+- Increase version to v2.0.0b9 (`#593 <https://github.com/ESMValGroup/ESMValCore/pull/593>`__) `Bouwe Andela <https://github.com/bouweandela>`__
+- Add a method to save citation information (`#402 <https://github.com/ESMValGroup/ESMValCore/pull/402>`__) `SarahAlidoost <https://github.com/SarahAlidoost>`__
+
 For older releases, see the release notes on https://github.com/ESMValGroup/ESMValCore/releases.
247 changes: 247 additions & 0 deletions esmvalcore/_citation.py
@@ -0,0 +1,247 @@
"""Citation module."""
import logging
import os
import re
import textwrap
from functools import lru_cache

import requests

from ._config import DIAGNOSTICS_PATH

logger = logging.getLogger(__name__)

REFERENCES_PATH = DIAGNOSTICS_PATH / 'references'

CMIP6_URL_STEM = 'https://cera-www.dkrz.de/WDCC/ui/cerasearch'

# The technical overview paper should always be cited
ESMVALTOOL_PAPER = (
    "@article{righi20gmd,\n"
    "\tdoi = {10.5194/gmd-13-1179-2020},\n"
    "\turl = {https://doi.org/10.5194/gmd-13-1179-2020},\n"
    "\tyear = {2020},\n"
    "\tmonth = mar,\n"
    "\tpublisher = {Copernicus {GmbH}},\n"
    "\tvolume = {13},\n"
    "\tnumber = {3},\n"
    "\tpages = {1179--1199},\n"
    "\tauthor = {Mattia Righi and Bouwe Andela and Veronika Eyring "
    "and Axel Lauer and Valeriu Predoi and Manuel Schlund "
    "and Javier Vegas-Regidor and Lisa Bock and Bj\\\"{o}rn Br\\\"{o}tz "
    "and Lee de Mora and Faruk Diblen and Laura Dreyer "
    "and Niels Drost and Paul Earnshaw and Birgit Hassler "
    "and Nikolay Koldunov and Bill Little and Saskia Loosveldt Tomas "
    "and Klaus Zimmermann},\n"
    "\ttitle = {Earth System Model Evaluation Tool (ESMValTool) v2.0 "
    "-- technical overview},\n"
    "\tjournal = {Geoscientific Model Development}\n"
    "}\n")


def _write_citation_files(filename, provenance):
    """
    Write citation information provided by the recorded provenance.

    Recipe and CMIP6 data references are saved into one bibtex file.
    CMIP6 data references are provided by the CMIP6 data citation service.
    Each CMIP6 data reference has a json link. If an internet connection
    is available, the CMIP6 data references are saved into the bibtex file.
    In addition, the CMIP6 data reference links are saved into a text file.
    """
    product_name = os.path.splitext(filename)[0]

    tags = set()
    cmip6_json_urls = set()
    cmip6_info_urls = set()
    other_info = set()

    for item in provenance.records:
        # get cmip6 data citation info
        cmip6_data = 'CMIP6' in item.get_attribute('attribute:mip_era')
        if cmip6_data:
            url_prefix = _make_url_prefix(item.attributes)
            cmip6_info_urls.add(_make_info_url(url_prefix))
            cmip6_json_urls.add(_make_json_url(url_prefix))

        # get other citation info
        references = item.get_attribute('attribute:references')
        if not references:
            # ESMValTool CMORization scripts use 'reference' (without final s)
            references = item.get_attribute('attribute:reference')
        if references:
            if item.identifier.namespace.prefix == 'recipe':
                # get recipe citation tags
                tags.update(references)
            elif item.get_attribute('attribute:script_file'):
                # get diagnostics citation tags
                tags.update(references)
            elif not cmip6_data:
                # get any other data citation tags, e.g. CMIP5
                other_info.update(references)

    _save_citation_bibtex(product_name, tags, cmip6_json_urls)
    _save_citation_info_txt(product_name, cmip6_info_urls, other_info)


def _save_citation_bibtex(product_name, tags, json_urls):
    """Save the bibtex entries in a bibtex file."""
    citation_entries = [ESMVALTOOL_PAPER]

    # convert tags to bibtex entries
    if tags:
        entries = set()
        for tag in _extract_tags(tags):
            entries.add(_collect_bibtex_citation(tag))
        citation_entries.extend(sorted(entries))

    # convert json_urls to bibtex entries
    entries = set()
    for json_url in json_urls:
        cmip_citation = _collect_cmip_citation(json_url)
        if cmip_citation:
            entries.add(cmip_citation)
    citation_entries.extend(sorted(entries))

    with open(f'{product_name}_citation.bibtex', 'w') as file:
        file.write('\n'.join(citation_entries))


def _save_citation_info_txt(product_name, info_urls, other_info):
    """Save all data citation information in one text file."""
    lines = []
    # Save CMIP6 url_info
    if info_urls:
        lines.append(
            "Follow the links below to find more information about CMIP6 data:"
        )
        lines.extend(f'- {url}' for url in sorted(info_urls))

    # Save any references from the 'references' and 'reference' NetCDF global
    # attributes.
    if other_info:
        if lines:
            lines.append('')
        lines.append("Additional data citation information was found, for "
                     "which no entry is available in the bibtex file:")
        lines.extend('- ' + str(t).replace('\n', ' ')
                     for t in sorted(other_info))

    if lines:
        with open(f'{product_name}_data_citation_info.txt', 'w') as file:
            file.write('\n'.join(lines) + '\n')


def _extract_tags(tags):
    """Extract tags.

    Tags are recorded as a list of strings converted to a string in provenance.
    For example, a single entry in the list `tags` could be the string
    "['acknow_project', 'acknow_author']".
    """
    pattern = re.compile(r'\w+')
    return set(pattern.findall(str(tags)))


def _get_response(url):
    """Return information from CMIP6 Data Citation service in json format."""
    json_data = None
    if url.lower().startswith('https'):
        try:
            response = requests.get(url)
            if response.status_code == 200:
                json_data = response.json()
            else:
                logger.warning('Error in the CMIP6 citation link: %s', url)
        except IOError:
            logger.info('No network connection, '
                        'unable to retrieve CMIP6 citation information')
    return json_data


def _json_to_bibtex(data):
    """Make a bibtex entry from CMIP6 Data Citation json data."""
    url = 'url not found'
    title = data.get('titles', ['title not found'])[0]
    publisher = data.get('publisher', 'publisher not found')
    year = data.get('publicationYear', 'publicationYear not found')
    authors = 'creators not found'
    doi = 'doi not found'

    if 'creators' in data:
        author_list = [
            item.get('creatorName', '') for item in data['creators']
        ]
        authors = ' and '.join(author_list)
        if not authors:
            authors = 'creators not found'

    if 'identifier' in data:
        doi = data['identifier'].get('id', 'doi not found')
        url = f'https://doi.org/{doi}'

    bibtex_entry = textwrap.dedent(f"""
        @misc{{{url},
        \turl = {{{url}}},
        \ttitle = {{{title}}},
        \tpublisher = {{{publisher}}},
        \tyear = {year},
        \tauthor = {{{authors}}},
        \tdoi = {{{doi}}},
        }}
        """).lstrip()
    return bibtex_entry


@lru_cache(maxsize=1024)
def _collect_bibtex_citation(tag):
    """Collect information from bibtex files."""
    bibtex_file = REFERENCES_PATH / f'{tag}.bibtex'
    if bibtex_file.is_file():
        entry = bibtex_file.read_text()
    else:
        entry = ''
        logger.warning(
            "The reference file %s does not exist, citation information "
            "incomplete.", bibtex_file)
    return entry


@lru_cache(maxsize=1024)
def _collect_cmip_citation(json_url):
    """Collect information from CMIP6 Data Citation Service."""
    json_data = _get_response(json_url)
    if json_data:
        bibtex_entry = _json_to_bibtex(json_data)
    else:
        bibtex_entry = ''
    return bibtex_entry


def _make_url_prefix(attribute):
    """Make url prefix based on CMIP6 Data Citation Service."""
    # the order of keys is important
    localpart = {
        'mip_era': '',
        'activity_id': '',
        'institution_id': '',
        'source_id': '',
        'experiment_id': '',
    }
    for key, value in attribute:
        if key.localpart in localpart:
            localpart[key.localpart] = value
    url_prefix = '.'.join(localpart.values())
    return url_prefix


def _make_json_url(url_prefix):
    """Make json url based on CMIP6 Data Citation Service."""
    json_url = f'{CMIP6_URL_STEM}/cerarest/exportcmip6?input={url_prefix}'
    return json_url


def _make_info_url(url_prefix):
    """Make info url based on CMIP6 Data Citation Service."""
    info_url = f'{CMIP6_URL_STEM}/cmip6?input={url_prefix}'
    return info_url
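
To illustrate how the URL helpers in `_citation.py` fit together, here is a minimal standalone sketch. It assumes the facets arrive as a plain dict rather than as provenance attributes, and the helper name `make_urls`, the constant `FACET_ORDER`, and the facet values are hypothetical, chosen only for illustration. The URL stem and the two path suffixes are copied from the module above.

```python
CMIP6_URL_STEM = 'https://cera-www.dkrz.de/WDCC/ui/cerasearch'

# Same key order as the `localpart` dict in _make_url_prefix.
FACET_ORDER = ('mip_era', 'activity_id', 'institution_id', 'source_id',
               'experiment_id')


def make_urls(facets):
    """Return (info_url, json_url) for a plain dict of facets (hypothetical)."""
    # Missing facets become empty strings, as in _make_url_prefix.
    prefix = '.'.join(facets.get(key, '') for key in FACET_ORDER)
    info_url = f'{CMIP6_URL_STEM}/cmip6?input={prefix}'
    json_url = f'{CMIP6_URL_STEM}/cerarest/exportcmip6?input={prefix}'
    return info_url, json_url


# Illustrative facet values only; not taken from the commit.
facets = {
    'mip_era': 'CMIP6',
    'activity_id': 'CMIP',
    'institution_id': 'MPI-M',
    'source_id': 'MPI-ESM1-2-LR',
    'experiment_id': 'historical',
}
info_url, json_url = make_urls(facets)
print(info_url)
print(json_url)
```

The info URL ends up in the `_data_citation_info.txt` file, while the json URL is what `_get_response` queries when building the bibtex entry.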
2 changes: 1 addition & 1 deletion esmvalcore/_provenance.py
@@ -79,7 +79,7 @@ def get_recipe_provenance(documentation, filename):
     entity = provenance.entity(
         'recipe:{}'.format(filename), {
             'attribute:description': documentation.get('description', ''),
-            'attribute:references': ', '.join(
+            'attribute:references': str(
                 documentation.get('references', [])),
         })
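
With this change the recipe references are stored as the string form of a list, which `_extract_tags` in `_citation.py` later parses with the regular expression `\w+`. The following sketch reproduces that round trip outside ESMValCore. The function name `extract_tags` is a stand-in for the private helper, and the tag names are the ones used in its docstring.

```python
import re


def extract_tags(tags):
    """Stand-in for _extract_tags: pull word-like tokens out of str(tags)."""
    pattern = re.compile(r'\w+')
    return set(pattern.findall(str(tags)))


# What the provenance record holds after the change above: str(list).
recorded = str(['acknow_project', 'acknow_author'])
tags = extract_tags({recorded})
print(sorted(tags))

# Caveat of the \w+ pattern: a hyphenated tag is split into two tokens.
split_tag = extract_tags({str(['some-tag'])})
print(sorted(split_tag))
```

Because `\w` matches only letters, digits, and underscores, this approach appears to rely on reference tags that contain no hyphens or other punctuation.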

2 changes: 1 addition & 1 deletion esmvalcore/_recipe.py
@@ -283,7 +283,7 @@ def _get_default_settings(variable, config_user, derive=False):
     settings['load'] = {
         'callback': concatenate_callback,
     }
-    # Configure merge
+    # Configure concatenation
     settings['concatenate'] = {}

     # Configure fixes
2 changes: 2 additions & 0 deletions esmvalcore/_task.py
@@ -17,6 +17,7 @@
 import psutil
 import yaml

+from ._citation import _write_citation_files
 from ._config import DIAGNOSTICS_PATH, TAGS, replace_tags
 from ._provenance import TrackedFile, get_task_provenance

@@ -565,6 +566,7 @@ def _collect_provenance(self):
             product = TrackedFile(filename, attributes, ancestors)
             product.initialize_provenance(self.activity)
             product.save_provenance()
+            _write_citation_files(product.filename, product.provenance)
             self.products.add(product)
         logger.debug("Collecting provenance of task %s took %.1f seconds",
                      self.name,
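
As a rough guide to what this call produces on disk, the sketch below derives the two output file names from a product file name, following the `os.path.splitext` call and the two `open(...)` calls in `_citation.py`. The helper `citation_file_names` and the example path are hypothetical.

```python
import os


def citation_file_names(product_filename):
    """Return the bibtex and info file names written next to a product."""
    # Strip the extension, as _write_citation_files does.
    product_name = os.path.splitext(product_filename)[0]
    return (f'{product_name}_citation.bibtex',
            f'{product_name}_data_citation_info.txt')


# Hypothetical diagnostic output file.
bibtex_file, info_file = citation_file_names('/work/plots/tas_timeseries.png')
print(bibtex_file)
print(info_file)
```

Note that the bibtex file is always written, since it contains at least the ESMValTool technical overview paper, whereas the text file is only written when there is CMIP6 or other data citation information to report.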
2 changes: 1 addition & 1 deletion esmvalcore/_version.py
@@ -1,2 +1,2 @@
 """ESMValCore version."""
-__version__ = '2.0.0b8'
+__version__ = '2.0.0b9'
7 changes: 4 additions & 3 deletions esmvalcore/cmor/tables/custom/CMOR_co2s.dat
@@ -7,10 +7,11 @@ modeling_realm: atmos
 ! Variable attributes:
 !----------------------------------
 standard_name: mole_fraction_of_carbon_dioxide_in_air
-units: mol mol-1
-cell_methods: time: mean
+units: 1e-06
+cell_methods: area: time: mean
 cell_measures: area: areacella
-long_name: Mole Fraction of CO2 at surface level
+long_name: Atmosphere CO2
+comment: As co2, but only at the surface
 !----------------------------------
 ! Additional variable information:
 !----------------------------------
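
The unit change from `mol mol-1` to `1e-06` means `co2s` values are now expressed in parts per million. The following sketch shows the scaling this implies. The function name and the sample value are illustrative assumptions; in ESMValCore the conversion would normally be handled by the unit machinery rather than by hand.

```python
import math


def mol_per_mol_to_1e_minus_6(values):
    """Scale mole fractions given in mol mol-1 to units of 1e-06 (ppm)."""
    return [value * 1e6 for value in values]


# Illustrative surface CO2 mole fraction, not taken from any dataset.
converted = mol_per_mol_to_1e_minus_6([0.0004])
print(converted)
```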