gkernel result decaying rapidly towards zero #392

alexkjames · 2023-04-19T18:56:42Z

Describe the bug
gkernel is not functioning as it should, values are moving exponentially towards zero

To Reproduce
To demonstrate the unusual behavior run

ts = pyleo.utils.load_dataset('LR04')
ts.gkernel().plot()

Expected behavior
Shouldn't decay towards zero

Additional context
This is likely due to code introduced by #381, which divides the value axis of the gkernel method by the sum of the weights each time values are calculated, resulting in the series decaying to zero very quickly.

CommonClimate · 2023-04-19T20:34:26Z

Thanks for catching my mistake. I see that things are working fine for LR04 now, but I just tried it on EDC dD and I get this:

It seems like the default choice of bin size (median) is too fine, so gkernel creates lots of empty segments. I tried to use step_type='max' instead, but:

(a) nomenclature is inconsistent with MultipleSeries.common_time(), which calls this variable step_style; shouldn't they be aligned?
(b) I get an error message with step_type='max':

/Users/julieneg/Documents/GitHub/Pyleoclim_util/pyleoclim/utils/tsutils.py:477: RuntimeWarning: invalid value encountered in double_scalars
  yc[i]  = sum(weight*yslice)/sum(weight) # normalize by the sum of weights
Out[9]: 
(<Figure size 720x288 with 1 Axes>,
 <Axes: xlabel='Age [y BP]', ylabel='$\\delta \\mathrm{D}$ [‰]'>)

Can you please look into it and see what is lurking behind?

alexkjames · 2023-04-19T20:52:10Z

a) I believe they aren't aligned because we didn't want to change the APIs and break legacy examples (which I think is wise)
b) I'll look into it, that's a strange error.

alexkjames · 2023-04-19T21:40:41Z

@CommonClimate I did some testing and I realized that gkernel wasn't actually acknowledging the passed step_type, nor has it ever acknowledged step_type. The parameter was overwritten in old versions of the code base, and in the new version it didn't get passed to the underlying utility. This has been fixed in my most recent commits.

In the example you attached, what's being shown is in fact the result of the usage of the 'max' step_style/step_type. However, the default parameter value for h, the kernel e-folding scale, means that the calculated weights are tiny for windows that contain only one or two data points in the low resolution portion of the series. When gkernel tries to calculate values using these tiny weights, python raises a warning and returns a nan value because we try to divide by a very, very small number.

I changed the docstring to warn people about this behavior. The function works properly otherwise as far as I can tell, it just requires that the user exercise some caution.

CommonClimate · 2023-04-19T21:47:48Z

hmm, i'll see if we can get it to behave more like bin() by making h a function of the increment size.

re: alignment, i'll add a step_style parameter and send a deprecation warning for step_type.

alexkjames · 2023-04-19T21:51:25Z

I think it has less to do with step size than it does with the minimum number of points in a given window. As far as I can tell if you have windows that have only one or two points in them, the weights will be tiny and python will be unhappy (unless h is large). Making it so that windows that only contain a single point just return that point might alleviate the issue somewhat, although you can still get really low weight values for windows that have just two or three points as well (edit: I just tried this and it didn't seem to make much of a difference, so its probably best to try your approach).

sounds good on the deprecation warning

CommonClimate · 2023-04-19T22:26:48Z

actually, I just got this with your code:

import pyleoclim as pyleo
ts = pyleo.utils.load_dataset('EDC-dD').convert_time_unit('ky BP')
fig, ax = ts.plot() 
tsg = ts.gkernel(step_type='max')
tsg.label = ts.label + ' (gkernel)'
tsg.plot(ax=ax,color='C1')
fig.tight_layout()

that's exactly what I was hoping the code would do, so no change needed. I'll just update the parameters and push a fix

CommonClimate · 2023-04-19T23:31:24Z

addressed by #393

alexkjames added the bug label Apr 19, 2023

alexkjames self-assigned this Apr 19, 2023

alexkjames changed the title ~~gkernel producing stranger results~~ gkernel result decaying rapidly towards zero Apr 19, 2023

alexkjames mentioned this issue Apr 19, 2023

fix gkernel #393

Merged

CommonClimate closed this as completed Apr 19, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

gkernel result decaying rapidly towards zero #392

gkernel result decaying rapidly towards zero #392

alexkjames commented Apr 19, 2023

CommonClimate commented Apr 19, 2023

alexkjames commented Apr 19, 2023

alexkjames commented Apr 19, 2023

CommonClimate commented Apr 19, 2023

alexkjames commented Apr 19, 2023 •

edited

Loading

CommonClimate commented Apr 19, 2023

CommonClimate commented Apr 19, 2023

gkernel result decaying rapidly towards zero #392

gkernel result decaying rapidly towards zero #392

Comments

alexkjames commented Apr 19, 2023

CommonClimate commented Apr 19, 2023

alexkjames commented Apr 19, 2023

alexkjames commented Apr 19, 2023

CommonClimate commented Apr 19, 2023

alexkjames commented Apr 19, 2023 • edited Loading

CommonClimate commented Apr 19, 2023

CommonClimate commented Apr 19, 2023

alexkjames commented Apr 19, 2023 •

edited

Loading