Add MiRS reader #1511
Conversation
Codecov Report
@@ Coverage Diff @@
## master #1511 +/- ##
==========================================
+ Coverage 92.59% 92.73% +0.13%
==========================================
Files 251 253 +2
Lines 36997 37576 +579
==========================================
+ Hits 34258 34845 +587
+ Misses 2739 2731 -8
I've updated the title because we use it in our release notes and didn't want to clutter it with the PR numbers.
This is done because the reader name in polar2grid is mirs, and that has been used for a while, so it should not change for users.
A couple of requests (mostly style stuff), but otherwise it is looking pretty good. Also 👍 for the pytest parametrize usage.
Remove variables that are never used. There seems to be no reason to handle coords again in __getitem__; let xarray handle these.
1) The file registry is not needed before running the pooch retrieve command. 2) Simplify the pooch retrieve code to make it easier to read, and call it when needed rather than at the beginning of the code. The filenames do not need to be in the scene metadata. 3) Use dask's where rather than roundabout numpy logic; this eliminates all the squeeze calls (see the sketch below). 4) Remove the bt_data array slice where it is not needed, but keep using the bt_data attributes for the corrected brightness temperature xarray.
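As a rough illustration of the dask.where point, a minimal sketch; the array names and the surface-type encoding here are made up for the example, not taken from the reader:

import dask.array as da
import numpy as np

# Hypothetical stand-ins for the reader's arrays.
surf_type_mask = da.from_array(np.random.randint(0, 2, (3200, 96)), chunks=(800, 96))
bt_sea = da.random.random((3200, 96), chunks=(800, 96))
bt_land = da.random.random((3200, 96), chunks=(800, 96))

# One lazy element-wise selection replaces the mask/index/squeeze round trips.
new_data = da.where(surf_type_mask == 0, bt_sea, bt_land)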
limb_correct_bt does not need to be in the class. Move code out of class methods and restructure the call to the limb_correction to facilitate this move.
Assigning a map_blocks call to a numpy array 96 times causes dask to compute before it is necessary. Also, I could be wrong, but it does not seem necessary to loop over map_blocks. It makes more sense to loop over the FOV inside the function that map_blocks calls, as sketched below.
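A hedged sketch of that restructuring; the function body is placeholder math and the names (limb_correct_block, coeffs) are illustrative, not the PR's actual apply_atms_limb_correction:

import dask.array as da
import numpy as np

def limb_correct_block(bt_block, coeffs=None):
    # Inside map_blocks the block is a plain numpy array, so numpy
    # (not dask) calls belong here.
    corrected = np.empty_like(bt_block)
    for fov in range(bt_block.shape[-1]):
        # Loop over FOV here, once per block, instead of issuing
        # 96 separate map_blocks calls from the outside.
        corrected[..., fov] = bt_block[..., fov] * coeffs[fov]  # placeholder math
    return corrected

# Fake (scanline, fov) brightness temperatures; the chunks keep the FOV
# dimension whole so per-FOV indexing is valid inside each block.
bt_data = da.random.random((3200, 96), chunks=(800, 96))
coeffs = np.linspace(0.9, 1.1, 96)  # fake per-FOV coefficients
correction = da.map_blocks(limb_correct_block, bt_data, coeffs=coeffs,
                           dtype=bt_data.dtype)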
Merge branch 'master' of https://github.com/pytroll/satpy into mirs
…py arrays. Also, don't hard-code the block size for the arrays; use the block size of the input dask array.
satpy/readers/mirs.py
Outdated
- bt_corrected = xr.DataArray(new_data, dims=bt_data[idx, :, :].dims,
-                             coords=bt_data[idx, :, :].coords,
+ bt_corrected = xr.DataArray(new_data, dims=("y", "x"),
+                             coords=surf_type_mask.coords,
What are the coords this variable has?
I was thinking about this and am not sure of the best way to add the coords. I was considering adjusting the way the new_coords method works at line 211 to make assigning longitude and latitude coordinates more obvious.
I was actually thinking about the same thing for the chunk size for map_blocks. I could add a general self.nc.shape so that I can access the scanlines/FOV without slicing bt_data.
Let's talk about it at the meeting today, BUT you shouldn't need to add lon/lat/x/y coordinates. The base reader class will do that for you based on the coordinates: entry defined in the YAML (or your available_datasets method).
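For reference, a sketch of what that hook can look like. The dict keys follow satpy's ds_info conventions, but the details (self.nc as the opened xarray.Dataset, the coordinate names) are illustrative assumptions rather than this PR's code:

def available_datasets(self, configured_datasets=None):
    # Pass through anything already configured in the YAML.
    for is_avail, ds_info in (configured_datasets or []):
        yield is_avail, ds_info
    # Advertise variables found in the file. Listing 'coordinates'
    # lets the base reader attach longitude/latitude itself, so the
    # file handler never assigns 2D lon/lat coords on its own.
    for var_name in self.nc.data_vars:
        yield True, {
            "file_type": self.filetype_info["file_type"],
            "name": var_name,
            "coordinates": ["longitude", "latitude"],
        }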
1) Remove the extra check for self.nc.coords and just check within the new_coords method. 2) Change all dask calls in the apply_limb_correction method to np calls.
This feature of xarray is not working well with dask: pydata/xarray#3068
1) Removes the new_coords method, which added lat/lon coords through the xarray.Dataset.assign_coords method. 2) Removes coordinate assignment when the bt_corrected xarray.DataArrays are constructed. 3) Restructures available_datasets: hopefully it is easier to understand, and it ensures that lat/lon coordinates have standard_names even though file metadata, which takes precedence over the YAML, may lack them.
otherwise they are dim_0 and dim_1 and are not recognized by the reader. Scanline/FOV are converted to y/x in the reader.
I think it can be True/False/None. I'm not sure I know or remember exactly what you're talking about and am missing some of the context of this comment/discussion. I think you have it right though. Regarding coords, it isn't how they are assigned, it is what is being assigned (a 2D dask array). Until we can get more confidence that xarray won't "accidentally" compute these, it would be best not to include them as coordinates.
I am sorry Dave, I meant to put a link for the available_datasets section and did not realize I hadn't done that. Sorry for the confusion. I am pretty sure that any file metadata in the mirs reader supersedes metadata read from the yaml. Perhaps I have missed a basic concept in building the datasets through the configuration file and dynamic datasets.
I think you based this reader off of the GAASP reader, which probably makes some assumptions about where dataset definitions are coming from. I think you are right, but technically the behavior is probably "undefined" when two datasets are yielded with the same DataID. I think you'll need to keep a "cache" of which DataIDs have already been yielded by the file handler and not re-yield/override them. I think I've used a list/set for this before. Obviously, making it work first is more important.
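Roughly, the cache idea looks like this; the handled set and the dict contents are my own placeholders, shown only to make the de-duplication concrete:

def available_datasets(self, configured_datasets=None):
    handled = set()
    # Configured (YAML) entries win; remember their names.
    for is_avail, ds_info in (configured_datasets or []):
        handled.add(ds_info["name"])
        yield is_avail, ds_info
    for var_name in self.nc.data_vars:
        if var_name in handled:
            # Already yielded once with this DataID; don't override it.
            continue
        yield True, {"file_type": self.filetype_info["file_type"],
                     "name": var_name}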
Correct, I based this on the GAASP reader and it does make assumptions about the location of metadata; mainly, I don't think it expects any of the metadata to be coming from the yaml. Currently, that is also how the mirs reader is working. Initially, I thought I would use the yaml to add metadata that was missing, but then I realized that could be a tricky thing. I would then want to make sure that the file metadata would always override metadata from the yaml, like units. I am comfortable with the idea that all the metadata comes from the file, and adding things like long_name and standard_name within the reader.
Yes, it should be adding coords as it (xarray) sees fit. You could …
satpy/readers/mirs.py
Outdated
bt_data = bt_data.transpose("Channel", "y", "x")
c_size = bt_data[idx, :, :].chunks
correction = da.map_blocks(apply_atms_limb_correction,
                           bt_data.values, idx,
The .values should be removed here, right?
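i.e. something along these lines, keeping the call lazy (a sketch abridged from the diff above; apply_atms_limb_correction and idx come from the surrounding code):

# Hand map_blocks the underlying dask array (.data) instead of .values,
# which would compute the whole array eagerly before the correction runs.
correction = da.map_blocks(apply_atms_limb_correction,
                           bt_data.data, idx,
                           dtype=bt_data.dtype)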
The necessary metadata to get Metop files read into satpy is added in the reader. The information provided in the yaml would be nice as a supplement to information in the file, but I can't think of a reason why it should override that information, except for descriptions, which can be handled elsewhere. Since trying to juggle yaml/file metadata would make the code more involved, I feel it is best to keep the model of the GAASP reader, which assumes the necessary metadata is in the file, and add missing metadata within the reader.
- Add the mocks.
- Create fake coefficient data in the test.
- Update how the coefficients are read in mirs by splitting out the reading of the file from the actual parsing (see the sketch below).
- Check the end of the coefficient file for n_chn and n_fov, removing the need for globals.
- Remove the global N_FOV in the loop which calculates the coefficients on the data; use the dataset size for the loop endpoint instead, since if the dataset for some reason does not have 96 FOV, a fixed endpoint would make this loop fail.
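The read/parse split might look roughly like the following; the function names and the coefficient file layout are placeholder assumptions, shown only to make the mocking strategy concrete:

def read_coeff_file(coeff_fn):
    # Thin I/O layer: the only part a test needs to mock.
    with open(coeff_fn) as coeff_file:
        return coeff_file.readlines()

def parse_coeffs(lines):
    # Pure parsing, exercised directly with fake coefficient text.
    # n_chn and n_fov sit at the end of the file per the commit message;
    # the exact line layout here is a placeholder assumption.
    n_chn, n_fov = (int(val) for val in lines[-1].split())
    coeffs = [[float(val) for val in line.split()] for line in lines[:-1]]
    return n_chn, n_fov, coeffs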
This looks good enough to me. Thanks for the patience on getting things in order and waiting on the aux data download stuff.
This PR is a replacement for PRs #1486 and #1285. This MiRS reader loads the level 2 EDR IMG swath files produced by the Microwave Integrated Retrieval System.
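For context, loading these files in Satpy then looks something like this; the filename and the dataset name below are illustrative examples, not taken from the PR:

from satpy import Scene

# 'mirs' is the reader name discussed above; the filename and dataset
# name here are placeholders.
scn = Scene(reader="mirs", filenames=["mirs_img_example_file.nc"])
print(scn.available_dataset_names())
scn.load(["TPW"])  # e.g. total precipitable water, if present in the file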