JP-1657: Make ivm default weight_type for resample #5738

Merged

Conversation

jdavies-st
Collaborator

@jdavies-st jdavies-st commented Feb 11, 2021

This adds weight_type="ivm" as the default weight scheme for ResampleStep.

This will obviously cause changes in the _s2d.fits, _i2d.fits, _x1d.fits and _cat.ecsv outputs in our regression tests. Further, some of the input _rate.fits files (NIRSpec MOS) do not have var_rnoise, so when it is accessed, it is all zeros, which gives zero weight to everything in the output. So those will have to have dummy var_rnoise arrays added to the input files.

Resolves #800 / JP-1657

@codecov

codecov bot commented Feb 11, 2021

Codecov Report

Merging #5738 (c283048) into master (16ee103) will increase coverage by 0.02%.
The diff coverage is 89.28%.

@@            Coverage Diff             @@
##           master    #5738      +/-   ##
==========================================
+ Coverage   75.92%   75.95%   +0.02%     
==========================================
  Files         406      406              
  Lines       36255    36254       -1     
==========================================
+ Hits        27528    27536       +8     
+ Misses       8727     8718       -9     
| Flag | Coverage Δ | *Carryforward flag |
|---|---|---|
| nightly | 75.94% <76.00%> (+<0.01%) ⬆️ | Carriedforward from a04b82c |
| unit | 54.62% <83.33%> (+0.02%) ⬆️ | |

*This pull request uses carry forward flags. Click here to find out more.

| Impacted Files | Coverage Δ |
|---|---|
| jwst/outlier_detection/outlier_detection.py | 85.06% <ø> (ø) |
| jwst/outlier_detection/outlier_detection_scaled.py | 20.51% <ø> (ø) |
| ...outlier_detection/outlier_detection_scaled_step.py | 28.57% <ø> (ø) |
| jwst/outlier_detection/outlier_detection_spec.py | 73.91% <ø> (ø) |
| .../outlier_detection/outlier_detection_stack_step.py | 27.27% <ø> (ø) |
| jwst/outlier_detection/outlier_detection_step.py | 86.91% <ø> (ø) |
| jwst/resample/resample.py | 96.19% <ø> (ø) |
| jwst/resample/resample_spec.py | 98.22% <ø> (ø) |
| jwst/resample/resample_step.py | 87.37% <50.00%> (+1.39%) ⬆️ |
| jwst/resample/gwcs_drizzle.py | 78.26% <60.00%> (+3.26%) ⬆️ |

... and 3 more

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 16ee103...c283048. Read the comment docs.

@hbushouse hbushouse added this to the Build 7.7.x milestone Feb 12, 2021
@jdavies-st jdavies-st changed the title Make ivm default weight_type for resample JP-1657: Make ivm default weight_type for resample Feb 17, 2021
@jdavies-st jdavies-st marked this pull request as ready for review February 17, 2021 23:36
@jdavies-st jdavies-st force-pushed the JP-1657-resample-ivm-weighting branch from 971becd to d151abe Compare February 17, 2021 23:46
Collaborator

@hbushouse hbushouse left a comment

Did you ever figure out why the current WHT arrays looked like they had Poisson noise included, even though it was supposed to be weighting by only exptime?

@@ -164,8 +163,7 @@ def do_drizzle(self):
    inwht = resample_utils.build_driz_weight(img,
                                             weight_type=self.weight_type,
                                             good_bits=self.good_bits)
    driz.add_image(img.data, img.meta.wcs, inwht=inwht,
                   expin=img.meta.exposure.exposure_time)
Collaborator

So given that exptime is still an allowed choice for weighting (just no longer the default), what happens when someone selects exptime but the exposure_time is no longer being passed to the lower-level functions?

Collaborator Author

Ah, good catch. Let me look at my unit tests and see what I missed.

That raises the larger question: should we even allow exposure time to be used? It clearly gives the wrong answer for multi-read up-the-ramp data, so maybe it should just be removed completely?
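
One way to see the concern is a minimal NumPy sketch (not the pipeline's code) that combines two measurements of the same pixel. It assumes independent measurements with equal exposure time but different read-noise variance; the helper name `weighted_mean_and_variance` and all numbers are hypothetical.

```python
import numpy as np


def weighted_mean_and_variance(values, variances, weights):
    """Return the weighted mean and the variance of that mean.

    Hypothetical helper; assumes the measurements are independent.
    """
    values = np.asarray(values, dtype=float)
    variances = np.asarray(variances, dtype=float)
    weights = np.asarray(weights, dtype=float)
    norm = weights.sum()
    mean = (weights * values).sum() / norm
    var_of_mean = (weights**2 * variances).sum() / norm**2
    return mean, var_of_mean


# Made-up pixel seen in two exposures of equal length, one much noisier.
values = [10.0, 14.0]
variances = [1.0, 9.0]
exptimes = [100.0, 100.0]

# Exposure-time weighting treats both measurements as equally reliable.
mean_exp, var_exp = weighted_mean_and_variance(values, variances, exptimes)

# Inverse-variance weighting favors the less noisy measurement.
ivm_weights = 1.0 / np.asarray(variances)
mean_ivm, var_ivm = weighted_mean_and_variance(values, variances, ivm_weights)

print(f"exptime: mean={mean_exp:.2f}, variance={var_exp:.2f}")
print(f"ivm:     mean={mean_ivm:.2f}, variance={var_ivm:.2f}")
```

With these numbers the inverse-variance combination has the smaller variance, because exposure time alone carries no information about how noisy each input is.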

Collaborator

So what's the verdict here? Does it still work when selecting "exptime" weighting?

# elif weight_type == 'ivm':
#     _inwht = img.buildIVMmask(chip._chip,dqarr,pix_ratio)
if weight_type == 'ivm':
    with np.errstate(divide="ignore", invalid="ignore"):
Collaborator

Ooh, nice. I have to remember to use this in the future.

Collaborator Author

Yes, useful! Though it has its pitfalls. I only did it after I made sure the arrays looked how I expected them to, as the warning messages were telling me something. For instance, dividing by zero variance produced np.inf, but the original code here that used np.nan_to_num then turned that np.inf into:

In [2]: np.nan_to_num(np.inf)                                                                                           
Out[2]: 1.7976931348623157e+308

which is clearly wrong. Hence the call below to np.isfinite.
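
The pattern described here can be sketched in isolation as follows. This is an illustration with a made-up `var_rnoise` array, not the code in `build_driz_weight`; it only uses `np.errstate`, `np.nan_to_num`, `np.isfinite` and `np.where`.

```python
import numpy as np

# Hypothetical read-noise variance with a zero and a NaN entry.
var_rnoise = np.array([4.0, 0.0, np.nan, 0.25])

# Silence the divide-by-zero and invalid-value warnings for this block only.
with np.errstate(divide="ignore", invalid="ignore"):
    inv_variance = 1.0 / var_rnoise  # -> [0.25, inf, nan, 4.0]

# nan_to_num alone maps inf to the largest finite float: an enormous weight.
naive = np.nan_to_num(inv_variance)

# Masking with isfinite gives undefined pixels zero weight instead.
weights = np.where(np.isfinite(inv_variance), inv_variance, 0.0)

print(naive)
print(weights)
```

Suppressing the warnings is only safe once the non-finite values are handled explicitly, which is what the `np.isfinite` mask does.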

from jwst.assign_wcs import AssignWcsStep
from jwst.extract_2d import Extract2dStep
from jwst.resample import ResampleSpecStep, ResampleStep

@pytest.fixture
def nirspec_rate():
    shape = (2048, 2048)
    ysize = 2048
Collaborator

Hey, I thought we were supposed to keep test images small?!

Collaborator Author

I think this needs to stay large because it is NIRSpec for some reason? I didn't change it from the original test that was put here (not by me). It still runs fast, because it is not doing any jump detection or ramp fitting.

@jdavies-st
Collaborator Author

As mentioned above, some of our input regression test data is very old and predates the population of var_rnoise arrays in ramp fitting. This is the case for the simulated NIRSpec MOS data we have and a couple other datasets. For real data, there will always be var_rnoise. For this old data, is mocking it up with ones acceptable? Any other ideas?

Without mocking it up, the var_rnoise attribute gets auto-created with zeros, and then everything has zero weight and you get no data in the output.

@jdavies-st
Collaborator Author

> Did you ever figure out why the current WHT arrays looked like they had Poisson noise included, even though it was supposed to be weighting by only exptime?

Yes, and it turns out that got fixed in the 0.17.x releases. 🙄

@jdavies-st jdavies-st force-pushed the JP-1657-resample-ivm-weighting branch from d151abe to de2c78d Compare February 19, 2021 04:46
@hbushouse
Collaborator

> As mentioned above, some of our input regression test data is very old and predates the population of var_rnoise arrays in ramp fitting. This is the case for the simulated NIRSpec MOS data we have and a couple other datasets. For real data, there will always be var_rnoise. For this old data, is mocking it up with ones acceptable? Any other ideas?
>
> Without mocking it up, the var_rnoise attribute gets auto-created with zeros, and then everything has zero weight and you get no data in the output.

Yes, I guess we'll need to fix the input files to the regtests by just adding some generic readnoise value. The simulated NRS data we get from the team is already in the form of a countrate image (equivalent to rate file) and hence it doesn't go through caldetector1 or ramp_fit to get var_rnoise populated. So just look at one of the NRS readnoise ref files and estimate a suitable value from that.

@jdavies-st jdavies-st force-pushed the JP-1657-resample-ivm-weighting branch from 0358737 to b2dd930 Compare February 19, 2021 19:58
@jdavies-st
Collaborator Author

> Yes, I guess we'll need to fix the input files to the regtests by just adding some generic readnoise value.

As you suggested, perhaps it would be best to check if the array exists. If it doesn't, then use a weight of 1 for everything and emit a warning. This would be a more general-purpose solution than chasing down all our data that doesn't have a var_rnoise array in it.
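
A minimal sketch of that proposal, assuming a missing array is represented as `None`. The helper `ivm_weight_or_ones` is hypothetical and is not a function in `resample_utils`.

```python
import warnings

import numpy as np


def ivm_weight_or_ones(data, var_rnoise=None):
    """Hypothetical helper: inverse-variance weights, or ones if no variance."""
    if var_rnoise is None:
        warnings.warn(
            "No var_rnoise array found; using a weight of 1 everywhere.",
            RuntimeWarning,
        )
        return np.ones_like(data, dtype=float)
    with np.errstate(divide="ignore", invalid="ignore"):
        inv = 1.0 / np.asarray(var_rnoise, dtype=float)
    # Pixels with zero or non-finite variance get zero weight.
    return np.where(np.isfinite(inv), inv, 0.0)


data = np.zeros((2, 2))

# Capture the warning so the fallback path can be inspected.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    fallback = ivm_weight_or_ones(data)

weighted = ivm_weight_or_ones(data, np.full((2, 2), 4.0))
```

This only helps when the array is truly absent at the point where the weights are built.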

@jdavies-st
Collaborator Author

jdavies-st commented Feb 23, 2021

So, for datasets that start calwebb_spec2 with no var_rnoise array, it gets created pretty early on, as soon as the first error propagation step happens. Certainly by flat_field. So by the time we get to resample, the array already exists and is mostly zeros. This presents a problem for dealing with it in a smart way in resample.

At this point it also has NaNs in it, due to scaling by the flat (I assume that is what introduces the NaNs), so it's a mess: a combination of zeros and NaNs.

Will think on this a bit more and find a solution.

@jdavies-st jdavies-st force-pushed the JP-1657-resample-ivm-weighting branch from 069bd29 to a04b82c Compare February 24, 2021 05:14
@jdavies-st
Collaborator Author

So the solution is to raise a ValueError if the weight map is all zeros, or a combination of zeros and non-finite numbers (np.inf, np.nan) with a useful error message. For data coming through Ops, all should have var_rnoise arrays.
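
As a rough sketch of such a check, assuming the weight map is a plain NumPy array. The name `check_weight_map` and the message text are hypothetical, not what the PR adds to `build_driz_weight`.

```python
import numpy as np


def check_weight_map(wht):
    """Hypothetical check: reject a weight map with no usable pixels."""
    wht = np.asarray(wht, dtype=float)
    # A pixel is usable only if its weight is finite and non-zero.
    usable = np.isfinite(wht) & (wht != 0)
    if not usable.any():
        raise ValueError(
            "Weight map contains only zeros and/or non-finite values; "
            "check that the input has a populated var_rnoise array."
        )
    return wht


good = check_weight_map([0.0, 2.0])

try:
    check_weight_map([0.0, np.nan, np.inf])
    raised = False
except ValueError:
    raised = True
```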

@jdavies-st
Collaborator Author

I have updated the rate.fits files in our nightly regression tests that are old enough to not have var_rnoise arrays. I also added one to the NIRSpec MOS observation that is simulated as a rate file and was not run through ramp fitting.

This PR is done. Next step is to figure out what goes into the err array for these resampled data.

@hbushouse
Collaborator

Given that JP-1657 was really just concerned with turning on IVM weighting, which this PR accomplishes, I wonder if it would be cleanest to also close JP-1657 and start a new ticket to continue the discussion of how to create an ERR array in resampled products. They really are 2 different things (as you've explained to me).

@jdavies-st jdavies-st force-pushed the JP-1657-resample-ivm-weighting branch from 00ecb35 to c283048 Compare February 25, 2021 05:30
@jdavies-st
Collaborator Author

Well, it turns out raising errors in build_driz_weight wreaks havoc throughout the rest of our unit tests. I've opted for warnings for now, but I'm happy to revisit or modify this in a subsequent PR. I'm going to merge this and kick off regtests.

@jdavies-st jdavies-st merged commit 3057780 into spacetelescope:master Feb 25, 2021
@jdavies-st jdavies-st deleted the JP-1657-resample-ivm-weighting branch February 25, 2021 06:08
Successfully merging this pull request may close these issues.

Implement IVM weighting in resample step
3 participants