JP-2096: Cube build stage3 #6093
PR ready for review.
See JP-2096 for information on numba and the speed tests D Law has run with the code.
Once it is approved that numba can be used in the JWST pipeline, I can make further speed improvements in the blotting routines and have them use numba. I am going to hold off on that until it is confirmed that numba is fine to use in the JWST pipeline.
Codecov Report
@@ Coverage Diff @@
## master #6093 +/- ##
==========================================
- Coverage 77.69% 76.56% -1.14%
==========================================
Files 402 404 +2
Lines 34412 35259 +847
==========================================
+ Hits 26736 26995 +259
- Misses 7676 8264 +588
@jdavies-st or @eslavich any ideas what's killing the CI test "CI/Installed package with --pyargs"?
It looks like |
If you search the log for "FAILURES" you can see it is having trouble |
We should be opening those reffiles in a cleaner way, i.e. with a |
@jdavies-st @nden if you have time now, could you look this over or suggest someone else to look at it?
@hbushouse @nden @mcara |
Following the usual means of checking out a PR, I'm getting errors about being unable to find the C modules. E.g., I'm probably just doing something wrong: do I need to do anything special to compile the code inside the src/ directory first, or should python do that automatically?
To compile the c code. In the top jwst directory - the directory with setup.py |
index_x = np.where(xdistance <= roi_det)
index_y = np.where(ydistance <= roi_det)
I am not familiar with the algorithm, but it seems like index_x and index_y could, in principle, have different lengths or point to different "pixels". Would this be an issue? Especially the different lengths in the code just below.
I am going to hold off on making changes to blot_cube.py because I have another JP ticket to work just on blot cube after I get the C extensions in this PR committed. I will come back to these suggested changes later this week.
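On the length question raised above: as far as the code quoted in this PR shows, index_x and index_y are only ever combined through nested comprehensions, which form every (y, x) pairing. The following is a minimal sketch with made-up values (the arrays and roi_det here are hypothetical, not taken from the pipeline) suggesting that differing lengths are not a problem in themselves, since the result simply has len(y) * len(x) entries.

```python
import numpy as np

# Hypothetical pixel-centre distances and region-of-interest radius (not from the PR).
xdistance = np.array([0.2, 0.9, 0.4, 1.5])
ydistance = np.array([0.1, 0.3, 2.0])
roi_det = 1.0

# np.where with a single argument returns a tuple of index arrays, one per axis.
index_x = np.where(xdistance <= roi_det)
index_y = np.where(ydistance <= roi_det)

nx, ny = len(index_x[0]), len(index_y[0])

# The nested comprehension pairs every selected y with every selected x,
# so the result always has ny * nx entries, whatever the two lengths are.
pairs = [(iy, ix) for iy in index_y[0] for ix in index_x[0]]
```

What does matter is that every array built from these indices is flattened in the same order, since they are later combined element by element.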
d1pix = x_cube[ipt] - xcenter[index_x]
d2pix = y_cube[ipt] - ycenter[index_y]

dxy = [(dx * dx + dy * dy) for dy in d2pix for dx in d1pix]
dxy = np.array(dxy)
dxy = np.sqrt(dxy)
weight_distance = np.exp(-dxy)
Suggested change:
weight_distance = np.exp(-np.sqrt(np.add.outer(
    np.square(x_cube[ipt] - xcenter[index_x]),
    np.square(y_cube[ipt] - ycenter[index_y])
).ravel()))
or, alternatively:
weight_distance = np.exp(-np.linalg.norm(np.meshgrid(
    y_cube[ipt] - ycenter[index_y],
    x_cube[ipt] - xcenter[index_x]
), axis=0).ravel())
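When vectorizing this step, the flattening order is worth checking, because weight_distance is later added element by element into blot_weight[index2d]. The sketch below uses made-up offsets (d1pix and d2pix here are stand-ins, not pipeline data) and compares two np.add.outer argument orders against the original comprehension. Under these assumptions, passing the y term first reproduces the comprehension's order, while passing the x term first yields the same values in a different order.

```python
import numpy as np

# Made-up offsets with different lengths so an ordering mismatch is visible.
d1pix = np.array([0.0, 1.0, 2.0])   # stands in for x_cube[ipt] - xcenter[index_x]
d2pix = np.array([0.0, 3.0])        # stands in for y_cube[ipt] - ycenter[index_y]

# Reference: the comprehension from the PR (dy is the outer, slowest loop).
ref = np.exp(-np.sqrt(np.array(
    [(dx * dx + dy * dy) for dy in d2pix for dx in d1pix])))

# y first, x second: rows follow dy, columns follow dx, matching the reference.
y_major = np.exp(-np.sqrt(
    np.add.outer(np.square(d2pix), np.square(d1pix)).ravel()))

# x first, y second: same values, but flattened with dx as the slowest index.
x_major = np.exp(-np.sqrt(
    np.add.outer(np.square(d1pix), np.square(d2pix)).ravel()))
```

A check like np.allclose(ref, y_major) on real slices would be a cheap way to confirm whichever form is adopted.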
index2d = [iy * blot_xsize + ix for iy in index_y[0] for ix in (index_x[0] + xstart)]
index2d = np.array(index2d)
Suggested change:
index2d = np.add.outer(index_y[0] * blot_xsize, index_x[0] + xstart).ravel()
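To illustrate why the outer-sum form is a drop-in replacement here, the following sketch compares it with the comprehension on made-up indices (blot_xsize, xstart and the index tuples are hypothetical values chosen for the example). The y term is the first argument, so the flattened result keeps y as the slowest-varying index, as in the loop.

```python
import numpy as np

# Made-up stand-ins for the PR's variables.
index_x = (np.array([0, 1, 2]),)
index_y = (np.array([4, 5]),)
blot_xsize = 10
xstart = 3

# Original form: nested comprehension, y outer and x inner.
loop = np.array(
    [iy * blot_xsize + ix for iy in index_y[0] for ix in (index_x[0] + xstart)])

# Vectorized form: outer sum of row offsets and column indices, then flatten.
vec = np.add.outer(index_y[0] * blot_xsize, index_x[0] + xstart).ravel()
```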
ts1 = time.time()
log.debug(f"Time to map 1 slice = {ts1-ts0:.1f}")
log.debug(f"Time to blot 1 slice on NIRspec = {ts1-ts0:.1f}")
Is timing for one slice relevant even for debugging purposes?
It is for NIRSpec. It can take several seconds per slice. Once we get it faster I will remove the debug timing.
index2d = [iy * blot_xsize + ix for iy in index_y[0] for ix in (index_x[0])]
blot_flux[index2d] = blot_flux[index2d] + weighted_flux
blot_weight[index2d] = blot_weight[index2d] + weight_distance
blot_cube.blot_overlap(ipt, xstart,
Maybe just set xstart to 0?
holding off on blot changes - I have opened a separate ticket on blotting
jwst/cube_build/src/cube_match_sky.c
Outdated
int *idqv = NULL; // int vector for spaxel

if (mem_alloc_dq(ncube, &idqv)) return 1;

// Set all data to zero
for (long i = 0; i < ncube; i++){
    idqv[i] = 0;
}
Suggested change:
int *idqv; // int vector for spaxel
if (mem_alloc_dq(ncube, &idqv)) return 1;
done
jwst/cube_build/src/cube_match_sky.c
Outdated
int *idqv = NULL; // int vector for spaxel

if (mem_alloc_dq(ncube, &idqv)) return 1;

// Set all data to zero
for (long i = 0; i < ncube; i++){
    idqv[i] = 0;
}
Suggested change:
int *idqv; // int vector for spaxel
if (mem_alloc_dq(ncube, &idqv)) return 1;
done
double c2_min;
double c1_max;
double c2_max;
int status = 0;
Suggested change:
int status;
jwst/cube_build/src/cube_match_sky.c
Outdated
double *fluxv = NULL, *weightv=NULL, *varv=NULL ; // vectors for spaxel
double *ifluxv = NULL; // vector for spaxel

// allocate memory to hold output
if (mem_alloc(ncube, &fluxv, &weightv, &varv, &ifluxv)) return 1;

double set_zero=0.0;
// Set all data to zero
for (int i = 0; i < ncube; i++){
    varv[i] = set_zero;
    fluxv[i] = set_zero;
    ifluxv[i] = set_zero;
    weightv[i] = set_zero;
}
This can be simplified as in cube_match_internal.c if using alloc_flux_arrays.
good suggestion. I moved alloc_flux_arrays to cube_utils.c
jwst/cube_build/src/cube_match_sky.c
Outdated
if (mem_alloc(ncube, &fluxv, &weightv, &varv, &ifluxv)) return 1;

double set_zero=0.0;
// Set all data to zero
for (int i = 0; i < ncube; i++){
    varv[i] = set_zero;
    fluxv[i] = set_zero;
    ifluxv[i] = set_zero;
    weightv[i] = set_zero;
}
Again this can be simplified as in cube_match_internal.c if using alloc_flux_arrays.
done
@@ -0,0 +1,484 @@
/*
The detector pixels are represented by a 'point could' on the sky. The IFU cube is
typo: "could" -> "cloud"
fixed
/*
The detector pixels are represented by a 'point could' on the sky. The IFU cube is
represented by a 3-D regular grid. This module finds the point cloud members contained
in a region centered on the center of the cube spaxel. The size of the spaxel is spatial
typo? Do you mean "size of the spaxel in spatial coords"?
fixed
The detector pixels are represented by a 'point could' on the sky. The IFU cube is
represented by a 3-D regular grid. This module finds the point cloud members contained
in a region centered on the center of the cube spaxel. The size of the spaxel is spatial
coordinates is cdetl1 and cdelt2, while the wavelength size is zcdelt3.
typo? should "zcdelt3" be just "cdelt3"?
Oh, and "cdetl1" should be "cdelt1"
fixed
represented by a 3-D regular grid. This module finds the point cloud members contained
in a region centered on the center of the cube spaxel. The size of the spaxel is spatial
coordinates is cdetl1 and cdelt2, while the wavelength size is zcdelt3.
This module uses the e modified shephard weighting method to determine how to weight each point clold member
do you really mean "the e modified shepard weighting" or is the "e" extraneous? And another instance of "clold" that should be "cloud".
jwst/cube_build/src/cube_match_sky.c
Outdated
@@ -0,0 +1,1441 @@
/*
The detector pixels are represented by a 'point could' on the sky. The IFU cube is
"could" -> "cloud"
The detector pixels are represented by a 'point could' on the sky. The IFU cube is
represented by a 3-D regular grid. This module finds the point cloud members contained
in a region centered on the center of the cube spaxel. The size of the spaxel is spatial
coordinates is cdetl1 and cdelt2, while the wavelength size is zcdelt3.
All the same typos as the same paragraph in the previous module (cdetl1, zcdelt3, ...)
fixed typos
setup.cfg
Outdated
@@ -35,6 +35,7 @@ install_requires =
    gwcs>=0.16.1
    jsonschema>=3.0.2
    numpy>=1.17
    numba>=0.50.0
Now that we're using C extensions instead of numba, can this be removed?
yes I forgot I put that in
@jdavies-st @eslavich |
It looks like it was getting a corrupted CRDS reference file. I kicked off the CI again. Hopefully it won't be corrupted this time.
64d4bfa to 0de907f
I made the changes suggested in the review. I reran the regression tests and they look good. There are some expected differences because I improved the DQ flags for edge cases and I made a few other little improvements.
FYI the doc build is failing because the new C module cannot be imported for the doc build. @eslavich do you recall what the issue was for C extensions and doc builds?
We made a slight change in docs/conf.py or something like that to remove a line that was causing it to look for things in a parent directory. Look at https://github.com/spacetelescope/jwst/pull/6207/files
I'm seeing some larger changes in the data cubes than I would have expected given this just changes the language, not the algorithm; let me investigate further and get back to you. Did anything change that should have affected the total cube FOV?
@drlaw1558 I tweaked the DQ flagging, hopefully improving it near boundaries. I also tweaked, for MIRI, how the wavelength range that is used to build the cube is determined. There was a small bug in the old code.
We merged a fix for the doc build failure in #6230, so a rebase ought to get that unstuck.
Ok, disregard my last comment; the changes that I was seeing were due to something changing earlier in the pipeline (that I'll have to run down elsewhere), not cube build. Performance looks good to me; when the test is set up properly, SCI results from running spec3 in multiple different modes look identical before/after this change but with vastly improved runtimes.
Focusing entirely on results, the performance of this PR looks good to me. Running some test cases with dithered exposures in multiple bands through a variety of different kinds of cube building I get identical SCI results before/after this change but with vastly improved runtimes. DQ arrays look improved.
@hbushouse can we merge this PR?
// loop over each valid point on detector and find match to IFU cube based
// on along slice coordinate and wavelength
for (int ipixel= 0; ipixel< npt; ipixel++){
@jemorrison I believe you need to declare the variables in the beginning of the file as certain compilers won't like the definition within the for loop.
double wave_min = 10000;
double along_max = -10000;
double wave_max = -10000;
for (int j = 0; j< 4; j++){
Same comment as above
int nplane = naxis1 * naxis2;
// loop over possible overlapping cube pixels
for(int zz =iz1; zz < iz2+1; zz++){
and here
for(int zz =iz1; zz < iz2+1; zz++){
    double zcenter = zcoord[zz];
    int istart = zz * nplane;
    for (int aa= ia1; aa< ia2 + 1; aa++){
and here
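For context on the loop-declaration comments: declaring the counter inside the for statement is a C99 feature, so compilers running in an older C mode may reject it. One alternative to hoisting every declaration is to request C99 when the extension is built. The sketch below is an assumption-laden illustration, not the project's actual setup.py: the extension name is hypothetical, the cube_utils.c path is guessed from the file names mentioned in this PR, and the flag shown is the GCC/Clang spelling (other compilers use different options). A real build would probably also need the NumPy include directory.

```python
from setuptools import Extension

# Hypothetical extension name and source list; the real setup.py may differ.
cube_ext = Extension(
    "jwst.cube_build.cube_match_sky",
    sources=[
        "jwst/cube_build/src/cube_match_sky.c",
        "jwst/cube_build/src/cube_utils.c",  # assumed location
    ],
    # GCC/Clang flag selecting C99, which permits `for (int i = 0; ...)`.
    extra_compile_args=["-std=c99"],
)
```

The object would then be passed to setup() through its ext_modules argument.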
@nden nxy = nx * ny The code works fine doing this on my Mac. I am now thinking this may not be allowed: dynamically allocating the array. Should I change how I do this?
* updates using numba jit
* flake 8 fixes
* a few numba updates
* updates to support internal_cal and numba
* fix test - removing unused resolution file
* improved blotting speed using numba
* added c code for emsm
* fixed setup.py to compile match_det_cube
* updates to c code
* some changes to c python interface
* more fixes to c code
* added cube_match_internal and pulled common c routines to cube_utils.c
* Clean up
* remove cube_cloud.py
* added weighting=msm as possibility for c extension cube weighting
* removed declaration of numba from routine
* fix typo
* flake8 fix
* remove printf from c code
* remove print in ifu_cube.py
* typo in cube_match_sky.c
* changes after review
* fix alloc arrays def
* Updated change log
* remove print statement

(cherry picked from commit 7a8738b)
Description
Speeding up cube_build using numba. Numba was installed using 'pip'.
The purpose of this PR is to speed up cube_build using numba. Some of the modules in ifu_cube.py were moved out
of the class and made independent routines. Some of these routines were broken up into simpler routines. In this process the MIRI PSF weighting option was removed. It is not being used and it was getting cumbersome to keep it as an option. Removing MIRI PSF weighting as an option also means that the resolution reference file is no longer needed.
Including this PR in the JWST pipeline requires numba to be added. But so far there seems to be little downside to including numba. It is fast, stable and easy to use.
Closes #6064
Fixes JP-2096