JP-1887: refactor OLS ramp fitting. #5872
Conversation
…prove readability.
Codecov Report
@@ Coverage Diff @@
## master #5872 +/- ##
==========================================
+ Coverage 77.90% 78.19% +0.28%
==========================================
Files 399 399
Lines 35209 36118 +909
==========================================
+ Hits 27430 28241 +811
- Misses 7779 7877 +98
This pull request uses carry forward flags.
jwst/ramp_fitting/ramp_fit.py (outdated diff)

    if opt_model is not None

    save_opt : boolean
        calculate optional fitting results
Reviewer comment:
Minor nit in wording, but this should perhaps say "save optional fitting results", because the results have in fact already been calculated. All we're doing here is putting them into an optional output product.
Author reply:
I changed this wording.
    if dropframes1 is None:  # set to default if missing
        dropframes1 = 0
        log.debug('Missing keyword DRPFRMS1, so setting to default value of 0')
    input_model.data = data
Reviewer comment:
Do we have a good feel for whether stuffing these individual data arrays back into a new data model is going to lead to a big jump in memory usage for the step as a whole? We already had an input data model to the step, from which these data arrays were extracted, and now we're putting them back into another data model. I know it's cleaner that way in terms of programming, but this step is prone to memory issues, due to all the very large arrays being used. So I'm worried about creating what amounts to another copy of the data. Would it be possible to just pass in the original data model or does that not work because of the slicing of data needed for multiprocessing?
Author reply:
This is a wrapper I used because I changed the function prototype for ols_ramp_fit to ols_ramp_fit_single. This is an intermediate fix that lets the OLS refactoring continue without also having to refactor the multiprocessing code right now. Ultimately, this unpacking and repacking will be refactored out.
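The memory concern above can be sketched as follows; RampModel here is a hypothetical stand-in for a datamodel container, not the actual jwst code. Repacking is only cheap if the new object holds references to the existing arrays rather than copies:

```python
import numpy as np

# Illustrative sketch (hypothetical RampModel, not the actual jwst
# datamodel): whether "repacking" arrays into a new container costs
# memory depends on whether it stores references or copies.
class RampModel:
    def __init__(self, data):
        self.data = data  # keeps a reference to the same ndarray

data = np.zeros((2, 5, 64, 64), dtype=np.float32)  # small stand-in ramp cube

wrapped = RampModel(data)        # no new array is allocated
copied = RampModel(data.copy())  # an explicit copy doubles the footprint

print(wrapped.data is data)  # True
print(copied.data is data)   # False
```

If the wrapper only rebundles the arrays already extracted from the input model, the extra cost is a few object headers, not another copy of the pixel data.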
    err : ndarray
        Error array
    int_times : None
        Not used
Reviewer comment:
So if it's not used, why not remove it?
Author reply:
I didn't remove it because I wasn't sure why it was here to begin with. I will remove it if I determine it's not actually used, or that it can easily be obtained from the classes.
jwst/ramp_fitting/ramp_fit.py (outdated diff)

      orig_cubeshape : (int, int, int) tuple
          Original shape of input dataset

    - orig_nreads : int
    + orig_ngroups : int
          Original number of reads
Reviewer comment:
Should modify this comment here too, to change "reads" to "groups"
Author reply:
Thank you. Since our earlier discussion I have been removing references to "read" that really mean "group" in JWST. I changed this documentation.
Reviewer comment:
The changes look good to me, but they produce some significant floating point differences on a very small number of science pixels in at least one of our regression tests, test_nirspec_detector1[rate]. Here I ran all regression tests that run ramp fitting and produce a rate file:
https://plwishmaster.stsci.edu:8081/job/RT/job/jdavies-dev/228/
Ignore the keyword header failures; that's just because this branch is not rebased onto the most recent master.
When I run all of these tests locally with pytest jwst/regtest -k rate, I see the same differences, and I actually see slightly different floating point differences on another test:
_______________________________________________ test_verify_detector1[rate] ________________________________________________
run_detector1 = None
rtdata_module = {'input': '/private/tmp/scratch/test_nirspec_verify_rtdata_module0/nrs_verify_nrs1_uncal.fits',
'input_remote': 'jwst...h/nrs_verify_nrs1_rate.fits',
'truth_remote': 'jwst-pipeline/dev/truth/test_nirspec_verify/nrs_verify_nrs1_rate.fits'}
fitsdiff_default_kwargs = {'atol': 1e-07, 'ignore_fields': ['DATE', 'CAL_VER', 'CAL_VCS', 'CRDS_VER', 'CRDS_CTX', 'NAXIS1', ...], 'ignore_hdus': ['ASDF'], 'ignore_keywords': ['DATE', 'CAL_VER', 'CAL_VCS', 'CRDS_VER', 'CRDS_CTX', 'NAXIS1', ...], ...}
suffix = 'rate'
@pytest.mark.bigdata
@pytest.mark.parametrize(
'suffix', [
'dq_init', 'saturation', 'superbias', 'refpix', 'linearity',
'dark_current', 'jump', 'rate',
])
def test_verify_detector1(run_detector1, rtdata_module, fitsdiff_default_kwargs, suffix):
"""Test results of the detector1 and image2 processing"""
rtdata = rtdata_module
rtdata.input = replace_suffix(ROOT, 'uncal') + '.fits'
output = replace_suffix(ROOT, suffix) + '.fits'
rtdata.output = output
rtdata.get_truth('truth/test_nirspec_verify/' + output)
diff = FITSDiff(rtdata.output, rtdata.truth, **fitsdiff_default_kwargs)
> assert diff.identical, diff.report()
E AssertionError:
E fitsdiff: 4.2
E a: /private/tmp/scratch/test_nirspec_verify_rtdata_module0/nrs_verify_nrs1_rate.fits
E b: /private/tmp/scratch/test_nirspec_verify_rtdata_module0/truth/nrs_verify_nrs1_rate.fits
E HDU(s) not to be compared:
E ASDF
E Keyword(s) not to be compared:
E CAL_VCS CAL_VER CRDS_CTX CRDS_VER DATE NAXIS1 TFORM*
E Table column(s) not to be compared:
E CAL_VCS CAL_VER CRDS_CTX CRDS_VER DATE NAXIS1 TFORM*
E Maximum number of different data values to be reported: 10
E Relative tolerance: 1e-05, Absolute tolerance: 1e-07
E
E Extension HDU 1 (SCI, 1):
E
E Data contains differences:
E Data differs at [619, 5]:
E a> 0.30653778
E b> 0.30652106
E Data differs at [946, 13]:
E a> 0.5124498
E b> 0.51245815
E Data differs at [432, 14]:
E a> 1.6663404
E ? - ^
E b> 1.6664746
E ? ^ +
E Data differs at [785, 14]:
E a> 2.4191005
E ? ^^
E b> 2.4191675
E ? ^^
E Data differs at [163, 24]:
E a> -0.904209
E b> -0.9047445
E Data differs at [506, 56]:
E a> -1.5337013
E b> -1.5331658
E Data differs at [736, 56]:
E a> 1.3189616
E b> 1.3194972
E Data differs at [822, 59]:
E a> -1.0683262
E ? ^^
E b> -1.0683428
E ? + ^
E Data differs at [475, 65]:
E a> 0.19733553
E b> 0.19734389
E Data differs at [432, 74]:
E a> 4.538215
E ? ^^
E b> 4.538282
E ? ^^
E ...
E 227 different pixels found (0.01% different).
E
E Extension HDU 2 (ERR, 1):
E
E Data contains differences:
E Data differs at [736, 56]:
E a> 0.7799795
E b> 0.7800025
E Data differs at [654, 365]:
E a> 0.70179623
E b> 0.7017717
E Data differs at [121, 683]:
E a> 0.70261586
E b> 0.7025893
E Data differs at [668, 1554]:
E a> 0.7394267
E ? -
E b> 0.7394027
E ? +
E Data differs at [1548, 1625]:
E a> 0.74220675
E b> 0.7422182
E Data differs at [2040, 1659]:
E a> 0.7310654
E b> 0.73107785
E Data differs at [608, 1945]:
E a> 1.0903927
E ? --
E b> 1.0903755
E ? ++
E 7 different pixels found (0.00% different).
E
E Extension HDU 4 (VAR_POISSON, 1):
E
E Data contains differences:
E Data differs at [619, 5]:
E a> 0.020912398
E b> 0.020911254
E Data differs at [946, 13]:
E a> 0.03960225
E ? ^^
E b> 0.039602898
E ? ^^^
E Data differs at [432, 14]:
E a> 0.11363168
E b> 0.11364081
E Data differs at [785, 14]:
E a> 0.16637993
E b> 0.16638453
E Data differs at [736, 56]:
E a> 0.088048115
E b> 0.088083856
E Data differs at [475, 65]:
E a> 0.014082874
E ? ^^ -
E b> 0.01408347
E ? ^^
E Data differs at [432, 74]:
E a> 0.3185729
E ? ^^
E b> 0.3185776
E ? ^^
E Data differs at [1560, 76]:
E a> 0.027786894
E b> 0.027786272
E Data differs at [166, 182]:
E a> 0.016576806
E ? ^^
E b> 0.016579064
E ? ^ +
E Data differs at [434, 201]:
E a> 0.016168522
E b> 0.016169054
E ...
E 64 different pixels found (0.00% different).
E
E assert False
E + where False = <astropy.io.fits.diff.FITSDiff object at 0x7fb6b09cad60>.identical
/Users/jdavies/dev/jwst/jwst/regtest/test_nirspec_verify.py:76: AssertionError
I suspect this is because we are doing calculations in ramp fitting with np.float32 dtypes in our arrays, and order of operations matters.
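A minimal standalone sketch of that effect (plain NumPy, not pipeline code): float32 addition is not associative, so a refactor that regroups the same arithmetic can change low-order bits of the result:

```python
import numpy as np

# float32 carries roughly 7 decimal digits, so adding 1.0 to 1e8 is
# lost to rounding. Regrouping the same mathematical sum therefore
# changes the computed result.
x = np.float32(1e8)
y = np.float32(1.0)

left = (x + y) - x   # y vanishes inside (x + y) -> 0.0
right = (x - x) + y  # -> 1.0

print(left, right)  # 0.0 1.0
```

Differences like the ones in the fitsdiff report above, confined to the last few significant digits, are consistent with this kind of reordering rather than an algorithmic change.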
Author reply:
The failure seen in jwst/regtest/test_nirspec_subarray.py was due to the input data claiming meta.exposure.ngroups = 6 but having only 2 groups in the data, which was inconsistent and incorrect. This caused an issue in the code where sometimes that value was pulled from the data shape and sometimes from meta. It is now always pulled from the data shape, which I think is the correct approach. This errant dataset is now fixed, and all the regression tests run fine both on master and in this branch.
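A hedged sketch of that fix (variable names are illustrative; JWST ramp data is conventionally shaped (nints, ngroups, nrows, ncols)):

```python
import numpy as np

# Trust the data shape, not a metadata keyword that may disagree with it.
data = np.zeros((1, 2, 32, 32), dtype=np.float32)  # (nints, ngroups, ny, nx)
meta_ngroups = 6  # inconsistent keyword, as in the errant test dataset

nints, ngroups, ny, nx = data.shape
if meta_ngroups != ngroups:
    print(f"meta.exposure.ngroups={meta_ngroups} but data has {ngroups} groups;"
          " using the data shape")

print(ngroups)  # 2
```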
I think this is ready to merge after removing the last 2 commits relating to the g_pix comments, and adding an entry to CHANGES.rst.
Refactored OLS ramp fitting functions to reduce the number of function parameters and return values, simplifying the code to make it easier to read.
Addresses #5701 / JP-1887