Adding initial C code to ramp fitting.
Adding the setup.py necessary to install C extensions for ramp fitting.

Adding first attempt at C code.

Adding a setup.cfg file to be used for setup.

Updating calling location and C code.

Updated include ordering, as well as returning NoneType.  Can compile by calling setup.py directly and running a basic script.  Still cannot run 'pip install -e .'.  This generates a failure saying it is unable to find the numpy module, raising ModuleNotFoundError.

Updating setup.

Fixing install files to properly work with C extension framework in ramp fitting.

Changing names.

Updating C code to parse input parameters.

Adding ramp handling for each pixel.

Updating ramp data structure and pixel ramp data structure.

Getting a pixel ramp and printing it to the screen.

Updating code style and adding a simple linked list to handle segment computations.

Completed median rate computation without flags and without special cases.

Cleaning up the code and comments.  The non-special case median rate computation now works.

Commenting out calls to C extension function.

Putting functions in alphabetical order to make them easier to navigate.

Alphabetizing the functions to make them easier to search.

Finding a bug in the median rate computation.

Updating setup and NumPy macro for C based on code review.

Fixed the local copy of the DQ for an integration for median rate computation.

Completed the median rate calculation that accounts for flags in the ramp.

Beginning to update type checking for ndarrays passed to C.

Figuring out endianness solution.  Still need to figure out how to detect endianness.

Checking whether computing the byteswap makes things slower than Python.

Testing and comparing python to C code.

Working on the weighting of a segment for ramp fitting.

Continuing with weighted fitting.

Finished the segment computations.

Removed 'real_t' typedef to make code cleaner.  Finished pixel ramp computations but the read noise computation is different from python code.

JIC commit.

JIC commit.

Debugging the segment computation of the read noise variance.

Updated the read noise computations for normal segments.  Updated the documentation for ramp fitting to make the documentation clearer on the details of the computations.  Removed extra blank lines from CI test file.

Creating output base arrays and fix use of pixeldq in pixel_ramp.

Packaged up output results from the C computations to be passed back to the python code.

Adding a square root to the final computation of the combined error for the rate and rateints products.

Successful CI tests for basic ramps run through the C extension.

JIC commit.

Fixing segment pruner.

Adding test cases for C extension.

Adding cases.  Updated the final DQ array.  Started implementing the optional results product.

Moving debugging print statements.

Adding optional results memory management during ramp fitting.

Outputting the optional results product.  Some of the values still need to be computed.

Updating computing the optional results product.

Updating dimensions of the pedestal array.

Adding pedestal calculation.

Updating where the group time divide happens.  Adding special case computation and testing.  Tracing bug in special case.

Working out slope computation bug.

Forcing the use of C extension to ensure CI tests use the C extensions.

Updating tests and DQ flag computations.

Adding special case for one group segment.

Working on special cases.

Updating one group ramp testing.  The variance computation is correct now, but need to investigate the slope computation, which may also be wrong in python.
kmacdonald-stsci committed Aug 7, 2023
1 parent b5bd209 commit c65087c
Showing 9 changed files with 3,575 additions and 46 deletions.
30 changes: 19 additions & 11 deletions docs/stcal/ramp_fitting/description.rst
@@ -158,21 +158,27 @@ the least-squares fit is calculated with the first and last samples. In most practical
cases, the data will fall somewhere in between, where the weighting is scaled between the
two extremes.

The signal-to-noise ratio :math:`S` used for weighting selection is calculated from the
last sample as:

For segment :math:`k` of length :math:`n`, which includes groups :math:`[g_{k}, ...,
g_{k+n-1}]`, the signal-to-noise ratio :math:`S` used for weighting selection is
calculated from the last sample as:

.. math::
S = \frac{data \times gain} { \sqrt{(read\_noise)^2 + (data \times gain) } } \,,
where :math:`data = g_{k+n-1} - g_{k}`.

The weighting for a sample :math:`i` is given as:

.. math::
w_i = (i - i_{midpoint})^P \,,
w_i = \frac{ [(i - i_{midpoint}) / i_{midpoint}]^P }{ (read\_noise)^2 } \,,
where :math:`i_{midpoint} = \frac{n-1}{2}` and :math:`i = 0, 1, ..., n-1`.
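The updated signal-to-noise and weight formulas above can be sketched in NumPy. This is an illustrative helper, not the stcal implementation; the function and parameter names (`optimal_weights`, `p_exponent`) are made up for this sketch, the `abs()` is an assumption to keep weights nonnegative for odd :math:`P`, and the mapping from :math:`S` to :math:`P` (the table below) is taken as an input rather than reproduced:

```python
import numpy as np

def optimal_weights(segment, read_noise, gain, p_exponent):
    """Sketch of the Fixsen et al. (2000) optimal weighting described above.

    segment    : 1-D array of group values [g_k, ..., g_{k+n-1}]
    p_exponent : the P value chosen from the S table (assumed given)
    Returns (S, w) where S is the SNR used for P selection and w are
    the per-sample weights w_i = |(i - i_mid)/i_mid|^P / read_noise^2.
    """
    n = len(segment)
    data = segment[-1] - segment[0]                 # last minus first sample
    s = (data * gain) / np.sqrt(read_noise**2 + data * gain)
    i = np.arange(n)
    i_mid = (n - 1) / 2.0
    w = np.abs((i - i_mid) / i_mid) ** p_exponent / read_noise**2
    return s, w
```

For a 5-group ramp the midpoint sample gets zero weight and the end samples the largest, consistent with end-weighted fitting at high :math:`S`.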


where :math:`i_{midpoint}` is the sample number of the midpoint of the sequence, and
:math:`P` is the exponent applied to weights, determined by the value of :math:`S`. Fixsen
et al. 2000 found that defining a small number of P values to apply to values of S was
sufficient; they are given as:
is the sample number of the midpoint of the sequence, and :math:`P` is the exponent
applied to weights, determined by the value of :math:`S`. Fixsen et al. 2000 found that
defining a small number of P values to apply to values of S was sufficient; they are given as:

+-------------------+------------------------+----------+
| Minimum S | Maximum S | P |
@@ -195,11 +201,13 @@ Segment-specific Computations:
The variance of the slope of a segment due to read noise is:

.. math::
var^R_{s} = \frac{12 \ R^2 }{ (ngroups_{s}^3 - ngroups_{s})(tgroup^2) } \,,
var^R_{s} = \frac{12 \ R^2 }{ (ngroups_{s}^3 - ngroups_{s})(tgroup^2)(gain^2) } \,,
where :math:`R` is the noise in the difference between 2 frames,
:math:`ngroups_{s}` is the number of groups in the segment, and :math:`tgroup` is the group
time in seconds (from the keyword TGROUP).
time in seconds (from the keyword TGROUP). The division by the gain converts to
:math:`DN`. For the special case where a segment has length 1,
:math:`ngroups_{s}` is set to :math:`2`.
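The segment read-noise variance above, including the gain conversion and the length-1 special case, can be sketched as follows. The helper name and signature are illustrative only, not the actual C or Python identifiers in stcal:

```python
def read_noise_variance_seg(ngroups, read_noise, tgroup, gain):
    """Sketch of the per-segment read-noise variance:

        var^R_s = 12 R^2 / ((n^3 - n) * tgroup^2 * gain^2)

    The gain^2 divisor converts the result to DN; a segment of
    length 1 is treated as if it had 2 groups.
    """
    n = 2 if ngroups == 1 else ngroups
    return (12.0 * read_noise**2) / ((n**3 - n) * tgroup**2 * gain**2)
```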

The variance of the slope in a segment due to Poisson noise is:

@@ -265,10 +273,10 @@ The combined variance of the slope is the sum of the variances:
The square root of the combined variance is stored in the ERR array of the primary output.

The overall slope depends on the slope and the combined variance of the slope of each integration's
segments, so is a sum over integrations and segments:
segments, so is a sum over integration values computed from the segments:

.. math::
slope_{o} = \frac{ \sum_{i,s}{ \frac{slope_{i,s}} {var^C_{i,s}}}} { \sum_{i,s}{ \frac{1} {var^C_{i,s}}}}
slope_{o} = \frac{ \sum_{i}{ \frac{slope_{i}} {var^C_{i}}}} { \sum_{i}{ \frac{1} {var^C_{i}}}}
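The inverse-variance weighted combination above can be sketched as a small helper; the name `overall_slope` is illustrative and not part of the stcal API, and `slopes`/`variances` are assumed to hold one value per integration:

```python
import numpy as np

def overall_slope(slopes, variances):
    """Sketch of slope_o = sum_i(slope_i / var_i) / sum_i(1 / var_i),
    the inverse-variance weighted mean over per-integration values."""
    slopes = np.asarray(slopes, dtype=float)
    variances = np.asarray(variances, dtype=float)
    return (slopes / variances).sum() / (1.0 / variances).sum()
```

With equal variances this reduces to a plain mean; a lower-variance integration pulls the overall slope toward its value.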
Upon successful completion of this step, the status keyword S_RAMP will be set
1 change: 1 addition & 0 deletions pyproject.toml
@@ -45,6 +45,7 @@ requires = [
'setuptools >=61',
'setuptools_scm[toml] >=3.4',
'wheel',
'numpy'
]
build-backend = 'setuptools.build_meta'

25 changes: 25 additions & 0 deletions setup.py
@@ -0,0 +1,25 @@
import numpy
from setuptools import setup, find_packages, Extension


# package_data values are glob patterns relative to each specific subpackage.
package_data = {
"stcal.ramp_fitting.src": ["*.c"],
}

# Setup C module include directories
include_dirs = [numpy.get_include()]

# Setup C module macros
define_macros = [("NUMPY", "1")]

setup(
ext_modules=[
Extension(
"stcal.ramp_fitting.slope_fitter",
["src/stcal/ramp_fitting/src/slope_fitter.c"],
include_dirs=include_dirs,
define_macros=define_macros,
),
],
)
142 changes: 135 additions & 7 deletions src/stcal/ramp_fitting/ols_fit.py
@@ -7,6 +7,7 @@

import warnings

from .slope_fitter import ols_slope_fitter # c extension
from . import ramp_fit_class
from . import utils

Expand All @@ -19,6 +20,14 @@
log = logging.getLogger(__name__)
log.setLevel(logging.DEBUG)

# !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
# !!!!!!!!!!!!!!!!!!!!!!!! Debugging !!!!!!!!!!!!!!!!!!!!!!!!
import sys
sys.path.insert(1, "/Users/kmacdonald/code/common")
from general_funcs import *
# !!!!!!!!!!!!!!!!!!!!!!!! Debugging !!!!!!!!!!!!!!!!!!!!!!!!
# !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!


def ols_ramp_fit_multi(
ramp_data, buffsize, save_opt, readnoise_2d, gain_2d, weighting, max_cores):
Expand Down Expand Up @@ -655,6 +664,98 @@ def ols_ramp_fit_single(
opt_info : tuple
The tuple of computed optional results arrays for fitting.
"""

# XXX
# ramp_data.groupdq = ramp_data.groupdq * 0

dbg_print(f"{save_opt = }")
# use_c = ramp_data.dbg_run_c_code
# use_c = False
use_c = True
if use_c:
# XXX - Put C extension here.
print("=" * 80)
# XXX start C time
c_start = time.time()

if ramp_data.drop_frames1 is None:
ramp_data.drop_frames1 = 0
dbg_print(" ---------------- Entering C Code ----------------")
result = ols_slope_fitter(
ramp_data.data, ramp_data.groupdq, ramp_data.err, ramp_data.pixeldq,
gain_2d, readnoise_2d, ramp_data.zeroframe,
ramp_data.frame_time, ramp_data.group_time, ramp_data.groupgap,
ramp_data.nframes, ramp_data.drop_frames1,
ramp_data.flags_do_not_use, ramp_data.flags_jump_det,
ramp_data.flags_saturated, ramp_data.flags_no_gain_val,
ramp_data.flags_unreliable_slope, weighting, save_opt)
dbg_print(" ---------------- Return C Code ----------------")

# XXX end C time
c_end = time.time()
print("=" * 80)

return result

# XXX start python time
p_start = time.time()

image_info, integ_info, opt_info = ols_ramp_fit_single_python(
ramp_data, buffsize, save_opt, readnoise_2d, gain_2d, weighting)

# XXX end python time
p_end = time.time()
'''
if use_c:
c_python_time_comparison(c_start, c_end, p_start, p_end)
# compare_image_info(img, image_info)
compare_image_info_count(img, image_info)
'''

return image_info, integ_info, opt_info



def ols_ramp_fit_single_python(
ramp_data, buffsize, save_opt, readnoise_2d, gain_2d, weighting):
"""
Fit a ramp using ordinary least squares. Calculate the count rate for each
pixel in all data cube sections and all integrations, equal to the weighted
slope for all sections (intervals between cosmic rays) of the pixel's ramp
divided by the effective integration time.
Parameters
----------
ramp_data : RampData
Input data necessary for computing ramp fitting.
buffsize : int
The working buffer size
save_opt : bool
Whether to return the optional output model
readnoise_2d : ndarray
The read noise of each pixel
gain_2d : ndarray
The gain of each pixel
weighting : str
'optimal' is the only valid value
Return
------
image_info : tuple
The tuple of computed ramp fitting arrays.
integ_info : tuple
The tuple of computed integration fitting arrays.
opt_info : tuple
The tuple of computed optional results arrays for fitting.
"""

tstart = time.time()

if not ramp_data.suppress_one_group_ramps:
@@ -716,6 +817,15 @@ def ols_ramp_fit_single(
return image_info, integ_info, opt_info


def c_python_time_comparison(c_start, c_end, p_start, p_end):
c_diff = c_end - c_start
p_diff = p_end - p_start
c_div_p = c_diff / p_diff
dbg_print(f"{c_diff = :E}")
dbg_print(f"{p_diff = :E}")
dbg_print(f"{c_div_p = :E}")


def discard_miri_groups(ramp_data):
"""
For MIRI datasets having >1 group, if all pixels in the final group are
@@ -909,6 +1019,9 @@ def ramp_fit_slopes(ramp_data, gain_2d, readnoise_2d, save_opt, weighting):
# each segment are also calculated here.

med_rates = utils.compute_median_rates(ramp_data)
print(DELIM)
dbg_print(f"{med_rates = }")
print(DELIM)

# Loop over data integrations:
for num_int in range(0, n_int):
@@ -1108,10 +1221,20 @@ def ramp_fit_compute_variances(ramp_data, gain_2d, readnoise_2d, fit_slopes_ans)
# Suppress harmless arithmetic warnings for now
warnings.filterwarnings("ignore", ".*invalid value.*", RuntimeWarning)
warnings.filterwarnings("ignore", ".*divide by zero.*", RuntimeWarning)
var_p4_intermediate = den_p3 * med_rates[rlo:rhi, :]
var_p4[num_int, :, rlo:rhi, :] = var_p4_intermediate
'''
var_p4[num_int, :, rlo:rhi, :] = den_p3 * med_rates[rlo:rhi, :]
'''

# Find the segment variance due to read noise and convert back to DN
# XXX JP-3121 Maybe readnoise should be zero where Poisson is zero.
var_r4_intermediate = num_r3 * den_r3 / gain_sect**2
var_r4_intermediate[var_p4_intermediate == 0.] = 0.
var_r4[num_int, :, rlo:rhi, :] = var_r4_intermediate
'''
var_r4[num_int, :, rlo:rhi, :] = num_r3 * den_r3 / gain_sect**2
'''

# Reset the warnings filter to its original state
warnings.resetwarnings()
@@ -1139,6 +1262,7 @@ def ramp_fit_compute_variances(ramp_data, gain_2d, readnoise_2d, fit_slopes_ans)
# variance calculations.
s_inv_var_p3[num_int, :, :] = (1. / var_p4[num_int, :, :, :]).sum(axis=0)
var_p3[num_int, :, :] = 1. / s_inv_var_p3[num_int, :, :]

s_inv_var_r3[num_int, :, :] = (1. / var_r4[num_int, :, :, :]).sum(axis=0)
var_r3[num_int, :, :] = 1. / s_inv_var_r3[num_int, :, :]

@@ -1286,8 +1410,6 @@ def ramp_fit_overall(

slope_by_var4 = opt_res.slope_seg.copy() / var_both4

del var_both4

s_slope_by_var3 = slope_by_var4.sum(axis=1) # sum over segments (not integs)
s_slope_by_var2 = s_slope_by_var3.sum(axis=0) # sum over integrations

@@ -1322,6 +1444,8 @@

slope_int = the_num / the_den

del var_both4

# Adjust DQ flags for NaNs.
wh_nans = np.isnan(slope_int)
dq_int[wh_nans] = np.bitwise_or(dq_int[wh_nans], ramp_data.flags_do_not_use)
@@ -1345,8 +1469,9 @@
for num_int in range(0, n_int):
dq_slice = groupdq[num_int, 0, :, :]
opt_res.ped_int[num_int, :, :] = \
utils.calc_pedestal(ramp_data, num_int, slope_int, opt_res.firstf_int,
dq_slice, nframes, groupgap, dropframes1)
utils.calc_pedestal(
ramp_data, num_int, slope_int, opt_res.firstf_int,
dq_slice, nframes, groupgap, dropframes1)

del dq_slice

@@ -2193,6 +2318,7 @@ def fit_next_segment_short_seg_not_at_end(
inv_var[these_pix] += 1.0 / variance[these_pix]
warnings.resetwarnings()


# create array: 0...ngroups-1 in a column for each pixel
arr_ind_all = np.array(
[np.arange(ngroups), ] * c_mask_2d_init.shape[1]).transpose()
@@ -2205,6 +2331,7 @@

# Select pixels having at least 2 later good groups (these later good
# groups are a segment whose slope will be calculated)
# XXX Does this work properly? It doesn't appear to.
wh_more = np.where(tot_good_groups[these_pix] > 1)
pix_more = these_pix[wh_more]
start[pix_more] = end_locs[pix_more]
@@ -3589,6 +3716,7 @@ def calc_opt_sums(ramp_data, rn_sect, gain_sect, data_masked, mask_2d, xvalues,
# get final valid group for each pixel for this semiramp
ind_lastnz = fnz + mask_2d_sum - 1


# get SCI value of initial good group for semiramp
data_zero = data_masked[fnz, range(data_masked.shape[1])]
fnz = 0
@@ -3637,7 +3765,6 @@ def calc_opt_sums(ramp_data, rn_sect, gain_sect, data_masked, mask_2d, xvalues,
num_nz = c_mask_2d.sum(0) # number of groups in segment
nrd_prime = (num_nz - 1) / 2.
num_nz = 0

# Calculate inverse read noise^2 for use in weights
# Suppress, then re-enable, harmless arithmetic warning
warnings.filterwarnings("ignore", ".*divide by zero.*", RuntimeWarning)
@@ -3664,14 +3791,15 @@
xvalues[:, wh_m2d_f] = np.roll(xvalues[:, wh_m2d_f], -1, axis=0)
wh_m2d_f = np.logical_not(c_mask_2d[0, :])

c_data_masked = data_masked.copy()
c_data_masked[np.isnan(c_data_masked)] = 0.

# Create weighted sums for Poisson noise and read noise
nreads_wtd = (wt_h * c_mask_2d).sum(axis=0) # using optimal weights

sumx = (xvalues * wt_h).sum(axis=0)
sumxx = (xvalues**2 * wt_h).sum(axis=0)

c_data_masked = data_masked.copy()
c_data_masked[np.isnan(c_data_masked)] = 0.
sumy = (np.reshape((c_data_masked * wt_h).sum(axis=0), sumx.shape))
sumxy = (xvalues * wt_h * np.reshape(c_data_masked, xvalues.shape)).sum(axis=0)

18 changes: 12 additions & 6 deletions src/stcal/ramp_fitting/ramp_fit_class.py
Expand Up @@ -43,6 +43,8 @@ def __init__(self):

self.current_integ = -1

self.dbg_run_c_code = False

def set_arrays(self, data, err, groupdq, pixeldq):
"""
Set the arrays needed for ramp fitting.
Expand Down Expand Up @@ -177,15 +179,19 @@ def dbg_print_basic_info(self):

print(f"Shape : {self.data.shape}")
print(f"data : \n{self.data}")
print(f"err : \n{self.err}")
print(f"groupdq : \n{self.groupdq}")
print(f"pixeldq : \n{self.pixeldq}")
# print(f"err : \n{self.err}")
# print(f"pixeldq : \n{self.pixeldq}")
print("-" * 80)


def dbg_print_pixel_info(self, row, col):
print("-" * 80)
print(f" data :\n{self.data[:, :, row, col]}")
print(f" err :\n{self.err[:, :, row, col]}")
print(f" groupdq :\n{self.groupdq[:, :, row, col]}")
print(f" pixeldq :\n{self.pixeldq[row, col]}")
print(f" data")
for integ in range(self.data.shape[0]):
print(f"[{integ}] {self.data[integ, :, row, col]}")
print(f" groupdq")
for integ in range(self.data.shape[0]):
print(f"[{integ}] {self.groupdq[integ, :, row, col]}")
# print(f" err :\n{self.err[:, :, row, col]}")
# print(f" pixeldq :\n{self.pixeldq[row, col]}")
