Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Eccodes adopt #2607

Closed
wants to merge 6 commits into from
Closed

Eccodes adopt #2607

wants to merge 6 commits into from

Conversation

marqh
Copy link
Member

@marqh marqh commented Jun 15, 2017

Enabling the adoption of EcCodes as a replacement for GribAPI

some low level implementation changes required, but they are limited and I believe supportable from the WMO Manual on Codes

some test changes

this includes GRIB capabilities for Python3, which are running in test

perhaps this should be held as a PR until the pre-release is cut, to make deployment easier

@marqh marqh added this to the v2.0 milestone Jun 15, 2017
@marqh marqh requested review from bjlittle and pp-mo June 15, 2017 17:20
@@ -54,7 +54,7 @@ install:
conda install --quiet --file minimal-conda-requirements.txt;
else
if [[ "$TRAVIS_PYTHON_VERSION" == 3* ]]; then
sed -e '/ecmwf_grib/d' -e '/esmpy/d' -e 's/#.\+$//' conda-requirements.txt | xargs conda install --quiet;
sed -e '/esmpy/d' -e 's/#.\+$//' conda-requirements.txt | xargs conda install --quiet;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@marqh @bjlittle @pp-mo Can anybody explain to me what difference this makes please?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is a nasty hack to make sure ecmwf_grib is not installed for python3, where it won't build. This is no longer required as eccodes is py3 compatible

gribapi.grib_set(grib, "DyInDegrees", float(abs(y_step)))
if x_coord.units == 'degrees':
gribapi.grib_set(grib, "iDirectionIncrement",
round(1e6 * float(abs(x_step))))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@marqh What is this rounding and multiplication by 1e6 for?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the 1e6 is a multiplier to provide values in degrees x 10 ^ -6, as per the WMO manual on codes

the round ensures proper rounding to convert to an integer

this is to ensure that the iDirectionIncrement is set correctly. this is a coded key, which are preferred by design in the iris implementation, unlike DyInDegrees which is computed. the eccodes implementation has a subtle bug in its implementation, with different behaviour between py2 and py3

we ought to follow up on this with ECMWF, but I have not yet done this

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep. Needs raising, and would also hugely benefit from a comment.


# we currently have a known load/save
# problem with gribapi version and zero only data, where min == max.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@marqh Is this still the most appropriate position for this comment to live? It seems out of place now as it doesn't seem to refer to anything which is being tested here.

I also don't really understand the comment now; which gribapi version has the problem? Is this just a placeholder for a test that will be added later when the load/save problem is addressed?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

perhaps the term 'gribapi version' should just be removed from here?

@@ -34,7 +34,7 @@

if tests.GRIB_AVAILABLE:
import gribapi
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@marqh I don't understand why this import is still here. Is this not being replaced?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ECCodes provides a module called gribapi, it is a replacement for GRIB_API

see
https://software.ecmwf.int/wiki/display/ECC/GRIB-API+migration
for more information

thus, the import name is unchanged

@pelson pelson self-requested a review June 25, 2017 06:45
if np.min(data) == np.max(data):
msg = ('NAMEII cube #{}, "{}" has empty data : '
'SKIPPING test for this cube, as save/load will '
'not currently work with gribabi > 1v12.')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Needs replacing with eccodes if this still holds.

grid_definition_template_4_and_5
from iris.tests import mock


MDI = _MDIs[1]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not fond of the order dependence of MDI definition.

result = _unscale(value, factor)
return result


# Regulations 92.1.4 and 92.1.5.
_MDI = 2 ** 32 - 1
_MDIs = [2 ** 31 - 1, 2 ** 32 - 1]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The only real substantive change is this one, as far as I can see. Sorry for making you pull teeth, but are you able to talk me through why there are 2 MDIs?

The GRIB spec talks of:

92.1.4: All bits set to “1” for any value indicates that value is missing. This rule shall not apply to
packed data.

92.1.5: If applicable, negative values shall be indicated by setting the most significant bit to “1”

So I agree that for unsigned data the MDI is 2 ** 32 - 1, and for signed it is 2 ** 31 - 1, but 2 ** 31 - 1 most certainly isn't an MDI for signed. Treating it in this way cuts out the extra range that unsigned data gives and mis-represents data as MDI when it is not intended to be.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd be more inclined to codify the "is masked" operation as a function than have them as magic numbers in this way - it will help us avoid dtype checking all over the place.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can I just verify that this change is actually necessary for ecCodes integration? I can certainly imagine a situation where ecCodes now handles dtypes better and the signed/unsigned issue cropping up, but just want to confirm that it has occurred in the transition between gribapi and ecCodes...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, i confirm that this is explicitly needed for ECCodes integration. the handling of missing values has changed. I believe the new behaviour is more in line with the WMO regulations, where as the coding approach in current master is based on some interpretations made by gribapi which are not preserved in ECCodes

gribapi.grib_set(grib, "DyInDegrees", float(abs(y_step)))
if x_coord.units == 'degrees':
gribapi.grib_set(grib, "iDirectionIncrement",
round(1e6 * float(abs(x_step))))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep. Needs raising, and would also hugely benefit from a comment.

@DPeterK
Copy link
Member

DPeterK commented Jul 18, 2017

I'm 👍 for adopting eccodes over gribapi. But I think we should hold on merging this here until we have finalised how GRIB file interaction in Iris is dealt with (i.e. with reintroducing iris-grib).

@marqh
Copy link
Member Author

marqh commented Jul 20, 2017

But I think we should hold on merging this here until we have finalised how GRIB file interaction in Iris is dealt with (i.e. with reintroducing iris-grib).

as i stated here
https://groups.google.com/forum/#!topic/scitools-iris-dev/lFw2_CnaEHU
I think that there is significant benefit in banking this work, however the next steps take place. I think it is a better place to be to have adopted this change and closed off the GRIB - Dask - ECCodes work

@marqh
Copy link
Member Author

marqh commented Jul 20, 2017

@pelson

i have looked more into the missing data handling: I understand your waryness about this implementation and the multiple missing data values.

as I mentioned, some change is required due to behaviour changes in ECCodes.

I have piloted an approach using
gribapi.grib_is_missing
which has the proper links between relevant missing data value and available values, for different data types

you can see this on a branch on my repo
marqh#64

this alternative explicitly records None for missing values. I think this may be a more robust approach. There is a small work around for arrays of values, which borrows a tiny bit of logic from the current implementation. This is an edge case, in my view, with very limited scope for use and is not used in any of our integration tests

I'm happy to replace the commits on this PR with commits from that, if you think that the implementation is preferable. I have included all relevant changes on that branch, so it is ready to go, in my view, if it seems like a good option.

Although this capability offers the opportunity to adapt our save code to use gribapi.grib_set_missing I have not made any changes of this kind. There would be lots of places this would be required and no tests require it so I have left this to 'potential future work'

cheers
mark

@marqh
Copy link
Member Author

marqh commented Jul 24, 2017

i've just force updated this PR with my preferred implementation, using gribapi.grib_is_missing

you can see the previous branch at
https://github.com/marqh/iris/tree/eccodesAdoptFirstTry
in case it's useful

I think this is a cleaner approach and it is worth discussing it in situ in this PR

@marqh
Copy link
Member Author

marqh commented Jul 24, 2017

however, it seems i have upset travis, who is failing to build my PR, how frustrating

i've tried updating the install (useful anyway) to trigger this, but to no avail

any suggestions on how to ask travis very nicely to process this commit would be gratefully received

@marqh marqh closed this Jul 25, 2017
@marqh marqh reopened this Jul 25, 2017
@@ -158,7 +158,9 @@ def _unscale(v, f):

if isinstance(value, Iterable) or isinstance(factor, Iterable):
def _masker(item):
result = ma.masked_equal(item, _MDI)
numerical_mdi = 2 ** 32 - 1
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add comment here describing the logic process here, and the lack of examples or integration tests at present

@@ -459,8 +459,12 @@ def _get_key_value(self, key):
# By default these values are returned as unhelpful strings but
# we can use int representation to compare against instead.
res = gribapi.grib_get(self._message_id, key, int)
if gribapi.grib_is_missing(self._message_id, key) == 1:
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

get_key function, with inverted order, os check, then get

add
# Implementation of Regulations 92.1.4 and 92.1.5 via eccodes.
to docstring

Copy link
Member

@pelson pelson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't have the time to merge this right now, but wanted to just say that I'm OK with this in principle going in. We already know that this will need to be ported over to iris-grib, and I'm happy enough to bite the bullet and do that myself subsequently.

# to construct the masked array. The valure is transient, only in
# scope for this function.
numerical_mdi = 2 ** 32 - 1
item = [numerical_mdi if i is None else i for i in item]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't this be better done as:

np.ma.masked_array(item, mask=[i is None for i in item])

?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe not. If this were implemented, then a mixed type underlying array would be created, with floats and None, so NumPy would have to make an object array. This would be a behaviour change, which I don't think is desired.

Implementation of Regulations 92.1.4 and 92.1.5 via ECCodes.

"""
if gribapi.grib_is_missing(self._message_id, key) == 1:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The == 1 is superfluous, I believe.

@marqh marqh closed this Jul 27, 2017
@marqh marqh reopened this Jul 27, 2017
@marqh marqh closed this Jul 27, 2017
@marqh marqh reopened this Jul 27, 2017
# min = 0b111...110 = -(2**31 - 2)
# max = 0b011...111 = 2**31 - 1
# Use ECCodes gribapi to recognise missing value
_MDI = None
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we now remove this altogether?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So the answer is - some of the tests assert that a thing is missing, and this is used for that purpose.
The comment for this line doesn't really help in that regard. Would you mind adding something?

return res

def keys(self):
"""Return coded keys available in this Section."""
return self._keys

def _get_value_or_missing(self, key, use_int=False):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 for having factored this out
Rather than a binary toggle, would you mind passing the dtype of the expected result?

@pelson pelson self-assigned this Aug 16, 2017
@pelson pelson added the Blocked: CLA needed See https://scitools.org.uk. Submit the form at: https://scitools.org.uk/cla/v4/form label Aug 16, 2017
@pelson pelson added Status: Decision Required and removed Blocked: CLA needed See https://scitools.org.uk. Submit the form at: https://scitools.org.uk/cla/v4/form labels Oct 24, 2017
@pelson pelson removed this from the v2.0.0 milestone Oct 27, 2017
@pp-mo
Copy link
Member

pp-mo commented Oct 27, 2017

I've now ported all the changes here to iris_grib : SciTools/iris-grib#92

That is, except for some outstanding integration tests which are still in Iris.
Those bits are in #2887

@pelson
Copy link
Member

pelson commented Oct 28, 2017

Just to say: thanks for doing this @marqh, and thanks to @pp-mo for migrating it over to iris-grib.
👍

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants