Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

JP-1694: fix outlier detection #5601

Merged
merged 8 commits into from
Jan 14, 2021

Conversation

jemorrison
Copy link
Collaborator

@jemorrison jemorrison commented Jan 9, 2021

This PR goes a long way to fixing combined data for MIRI.
the parameter good_bits has to be set to (~DO_NOT_USE+NON_SCIENCE) to ignore both NON_SCIENCE data and BAD data set with DO_NOT_USE.
In addition the noise image used to flag outliers was updated to account for the background subtracted region

Resolves #5332 / JP-1694

Also resolves #5528 / JP-1806

@jemorrison jemorrison requested review from stsci-hack, jdavies-st and hbushouse and removed request for stsci-hack January 9, 2021 00:45
@codecov
Copy link

codecov bot commented Jan 9, 2021

Codecov Report

Merging #5601 (b13e8c7) into master (4fd2126) will decrease coverage by 0.31%.
The diff coverage is 100.00%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #5601      +/-   ##
==========================================
- Coverage   74.47%   74.16%   -0.32%     
==========================================
  Files         416      416              
  Lines       38107    38617     +510     
  Branches     5157     5157              
==========================================
+ Hits        28379    28639     +260     
- Misses       9728     9978     +250     
Flag Coverage Δ *Carryforward flag
nightly 75.99% <100.00%> (-0.03%) ⬇️ Carriedforward from a22b26f
unit 54.17% <100.00%> (+<0.01%) ⬆️

*This pull request uses carry forward flags. Click here to find out more.

Impacted Files Coverage Δ
jwst/flatfield/flat_field.py 64.81% <100.00%> (-16.99%) ⬇️
jwst/outlier_detection/outlier_detection.py 85.06% <100.00%> (ø)
jwst/associations/lib/rules_level3_base.py 91.15% <0.00%> (-1.68%) ⬇️
jwst/associations/lib/diff.py 96.02% <0.00%> (-0.01%) ⬇️
jwst/associations/lib/product_utils.py 100.00% <0.00%> (ø)
jwst/associations/lib/rules_level2b.py 96.68% <0.00%> (+0.29%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 4fd2126...b13e8c7. Read the comment docs.

Copy link
Collaborator

@hbushouse hbushouse left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume that the computed background needs to be added back into the blotted image because the individual cal image that it's going to be compared to still has the background signal in it, right? If so, that makes sense and hence that change seems appropriate.

I don't, however, see any change here related to the proper setting of good_bits.

@jemorrison
Copy link
Collaborator Author

I put the change for the good_bits in the cfg file I was using. I am not sure how other instruments handle NON_SCIENCE. So I don't k now if that is a global default change or something we put in the code or something that gets added to the pars-outlierdetection ? I am still trying to understand how the 'pars' method will work.

@hbushouse
Copy link
Collaborator

We recently stripped out all of the default parameter entries from all .cfg files for pipeline and steps, so now the defaults are set only in each step's spec block in the actual code module. For the outlier_detection step we have
https://github.com/spacetelescope/jwst/blob/master/jwst/outlier_detection/outlier_detection_step.py#L62
which sets the default good_bits to only ~DO_NOT_USE. So changing that to also include NON_SCIENCE in your .cfg file is fine for the purpose of experimenting (that is exactly what param settings in .cfg are for - using non-default values), but if we determine that this is appropriate as a new default value, then it should get modified in the spec settings in the code itself.

Theoretically, however, the approach we have taken in most, if not all, pipeline steps with regard to which DQ values to use to signify that a pixel is to be ignored is in fact just the DO_NOT_USE value. All other flag values are deemed informational only. So in this case, if we really want NON_SCIENCE pixels to be excluded from use in any/all steps, those pixels should also have the DO_NOT_USE flag set for them (in addition to NON_SCIENCE). If that were done, then no changes would be needed for the good_bits parameter here. For the particular case of MIRI imaging, we'd need to find out where the NON_SCIENCE DQ values are getting set (e.g. are they propagated in from the FLAT ref files?) and, if so, possibly update those FLAT ref files to also set DO_NOT_USE for those pixels.

@jdavies-st jdavies-st changed the title Outlier detection jp 1694 JP-1694: fix outlier detection for MIRI imaging Jan 11, 2021
@jemorrison
Copy link
Collaborator Author

jemorrison commented Jan 11, 2021

@hbushouse on the reason the background has to be added - yes the cal image still has the background signal in it.

@cracraft @mwregan2 @karllark Is there any reason why the pixels in the lower 3 masks are set to NON_SCIENCE and not NON_SCIENCE+DO_NOT_USE for the cal image pipelines ?
These pixels should not be used in the calimage3 pipeline because using them messes up the outlier detection step. They get included in the median image. It would be cleaner if we added DO_NOT_USE to their dqflags.

@cracraft
Copy link
Collaborator

Going back through emails with Ors, I found this note about the masking: Yes, the 4QPM should not be used. I did not mask the 4QPMs for 2 reasons. 1.) for onboard flats I needed them. 2.) Could be useful for image registration or could contain information that is valuable. Although I should mark NON_SCIENCE, to remove them from the mosaics, I totally agree. Is NO_FLAT_FIELD needed too?

We also talked about which mask values to use (though didn't discuss DO_NOT_USE) in JP-164.

@jemorrison
Copy link
Collaborator Author

jemorrison commented Jan 11, 2021

I can not find where MIRI Imager pixels are marked as NON_SCIENCE in the pipeline. Could someone point me ? I did a grep on NON_SCIENCE in the pipeline that pointed me to photom step. But when I look at the code it seems NON_SCIENCE is set for NIRSPec data but I can not find where it is set for MIRI imager data.

@cracraft
Copy link
Collaborator

NON_SCIENCE DQ values are in the flat files. It's not set as part of the pipeline, just read in from reference files.

@jemorrison
Copy link
Collaborator Author

@cracraft so would it cause down stream problems if as part of the flat field step we set pixels with NON_SCIENCE to NON_SCIENCE + DO_NOT_USE. You mentioned above that on board flats need these regions. Is that for making the flat fields not after they are applied.

@cracraft
Copy link
Collaborator

Ors would have to answer why he wanted to use the flags he did, and what effect he anticipated it having through the pipeline. I don't know if that would be a problem or not. I don't see that it would, but others should weigh in on that.

@jdavies-st
Copy link
Collaborator

I think it is fine to change the default in OutlierDetectionStep.spec to add NON_SCIENCE. MIRI is the only one that uses this, and I think it won't have an effect on any other instruments.

@hbushouse
Copy link
Collaborator

I think it is fine to change the default in OutlierDetectionStep.spec to add NON_SCIENCE. MIRI is the only one that uses this, and I think it won't have an effect on any other instruments.

That's a solution that achieves the proper results, but goes against the philosophy of having all steps only reject pixels flagged with DO_NOT_USE by default. Any other flag values, like NON_SCIENCE, are to be considered as information only and not cause for rejection. So it would be better to add DO_NOT_USE flags to these NON_SCIENCE pixels where necessary.

@jdavies-st
Copy link
Collaborator

jdavies-st commented Jan 11, 2021

Agree Howard, but as I mentioned in a previous thread on this topic, NON_SCIENCE might be DO_NOT_USE for one step, but fine to use on another step. I.e. if two steps that have different requirements, flat_field and outlier_detection in this case, there's no resolving it using a single dq bit in a single reffile.

In the end, this DQ flag is being propagated from the flat reference file. That doesn't seem ideal in controlling what happens in outlier_detection. Ideally you'd want this DQ flag populated with DO_NOT_USE specifically for the level3 pipeline.

@jemorrison
Copy link
Collaborator Author

@hbushouse
@jdavies-st
This PR is now ready for review. I added DO_NOT_USE to the pixels of NON_SCIENCE data after flat fielding.
So now adding the additional flag of NON_SCIENCE to good_bits is not needed in outlier_detection

@jemorrison jemorrison force-pushed the outlier_detection_JP_1694 branch from 4648868 to 01ed074 Compare January 13, 2021 15:36
Copy link
Collaborator

@jdavies-st jdavies-st left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the excellent debug here Jane! Just some questions and comments below.

@@ -421,7 +421,8 @@ def flag_cr(sci_image, blot_image, **pars):
#
# Model the noise and create a CR mask
diff_noise = np.abs(sci_data - blot_data)
ta = np.sqrt(np.abs(blot_data + subtracted_background) + err_data ** 2)
#ta = np.sqrt(np.abs(blot_data + subtracted_background) + err_data ** 2)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please delete the commented-out line.

@@ -406,7 +406,7 @@ def flag_cr(sci_image, blot_image, **pars):
exptime = sci_image.meta.exposure.exposure_time

sci_data = sci_image.data * exptime
blot_data = blot_image.data * exptime
blot_data = (blot_image.data +subtracted_background) * exptime
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Get a space after that + operator.

Comment on lines +218 to +222
# Find all pixels in the flat that have a DQ value of NON_SCIENCE
# add the DO_NOT_USE flag to these pixels. We don't want to use these pixels
# in futher steps
flag_nonsci = np.bitwise_and(science.dq, dqflags.pixel['NON_SCIENCE']).astype(np.bool)
science.dq[flag_nonsci] = np.bitwise_or(science.dq[flag_nonsci], dqflags.pixel['DO_NOT_USE'])
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding NON_SCIENCE doesn't seem to have any effect on the issue described in the ticket. I.e. in the dataset from the ticket, running calwebb_image3 starting with the _cal.fits files as they are resolves the issue with just the changes in outlier_detection, without having to rerun flat_field on the input data.

So what is the purpose of this section of code? Currently the NON_SCIENCE area in MIRI data gets set to zero in the resampled _i2d.fits output. I.e., I don't see a problem with what is happening currently.

@jemorrison
Copy link
Collaborator Author

jemorrison commented Jan 13, 2021 via email

@jemorrison
Copy link
Collaborator Author

The preferred method is for outlier detection to have good_bits = DO_NOT_USE only. If we do not go back to the flat field step and flag NON_SCIENCE data as also DO_NOT_USE then this data is in the combined image. This region should NOT be in the combined image. The longer term fix is to get the MIRI team to set the NON_SCIENCE pixels in the flat field reference file to DO_NOT_USE + NON_SCIENCE. We can then take out this code.

@jdavies-st
Copy link
Collaborator

That makes sense Jane. Thanks. I think what I'm seeing is that outlier_detection is using the areas that are NON_SCIENCE, but because in the final resample of calwebb_image3, it throws out NON_SCIENCE in addition to DO_NOT_USE, it is masking whatever is happening in outlier. I.e. here

GOOD_BITS = '~DO_NOT_USE+NON_SCIENCE'

@jemorrison
Copy link
Collaborator Author

@jdavies-st
I can confirm that the line in resample_step.py setting the GOOD_BIT is confusing the issue. It is hard coding the values. I went in and changed GOOD_BITS ='~DO_NOT_USE' in the line and the resampled data now contains the NON_SCIENCE pixels. I still think it is best to flag these NON_SCIENCE pixels as also DO_NOT_USE because they are properly excluded from the i2d, median and blot data. We don't have to flag them in FLAT_FIELD it just seemed the place to do it since the NON_SCIENCE flag is coming from there.

@jemorrison jemorrison requested a review from jdavies-st January 14, 2021 13:51
@jemorrison
Copy link
Collaborator Author

@jdavies-st Can we ignore the Code style check - it seems to be related to a long line in a file that is not even part of the PR.
I don't know why it is throwing an error

@hbushouse
Copy link
Collaborator

@jdavies-st Can we ignore the Code style check - it seems to be related to a long line in a file that is not even part of the PR.
I don't know why it is throwing an error

I fixed the long line issue in a PR I recently merged. So it'll be gone once this gets merged with master.

@hbushouse hbushouse added this to the Build 7.7 milestone Jan 14, 2021
Copy link
Collaborator

@hbushouse hbushouse left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the comment by @jdavies-st accurate, in that flagging the NON_SCIENCE pixels with DO_NOT_USE makes no difference to the final product? Does it make any difference in outlier_detection?

Ah, never mind. Just caught up on the rest of the comment thread here, which verifies that we do in fact want the NON_SCIENCE to have DO_NOT_USE added to them. So that means the only thing that needs changing yet is the change log.

@@ -1,6 +1,14 @@
0.18.2 (unreleased)
===================

flat_field
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor nit, but we like to keep the change entires in alphabetical order of step/pipeline name, so these should go below cube_build in the list.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok I think I finally have the change log correct. I just edited it on GitHub

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, looks good to me now.

Copy link
Collaborator

@jdavies-st jdavies-st left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am giving this PR my highest approval. 🥇

🌟 🌟 🌟 🌟

This not only fixes the reported issue for MIRI, but it also makes outlier_detection actually work for all the other instruments.

For instance, here's NIRCam imaging with 3 NRCB5 detectors at different roll angles.

Screen Shot 2021-01-14 at 3 51 56 PM

On the left is the i2d file for jwst 0.18.1. On the right is this branch. Outliers are actually detected and removed.

Rockon @jemorrison 🎉

@jdavies-st jdavies-st changed the title JP-1694: fix outlier detection for MIRI imaging JP-1694: fix outlier detection Jan 14, 2021
Copy link
Collaborator

@hbushouse hbushouse left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nailed it.

@@ -1,6 +1,14 @@
0.18.2 (unreleased)
===================

flat_field
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, looks good to me now.

@hbushouse
Copy link
Collaborator

As soon as CI tests complete successfully I'll merge (hopefully in time for the nightly regtest run).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
5 participants