Match hist bins #1944

Merged
merged 8 commits into from
Aug 15, 2017

alexcjohnson
Collaborator

Fixes #50 and #1864

  • All autobinned histograms on a given subplot, when barmode is not 'overlay', get the same bin size and alignment
  • That size is the minimum any of them were auto-assigned
  • If there are any manually binned histograms in the group, match the autobinned ones to the first manual one:
    • If there is more than one manual histogram and they're inconsistent, there's no way to fully satisfy the constraints, so don't even look at the others; just use the first one.
    • If the manual bins are larger than the autobins but not as large as the next larger autobin step, expand (and shift) the autobins to match exactly.
    • Otherwise, shrink the autobins to the next largest integer fraction of the manual bin size, and align the bin edges so that there is an autobin center at the center of each manual bin. In this case the manual bins will still be drawn incorrectly (with the narrower width of the autobins), but there's nothing we can do about that, since stacked/grouped bars must have matching widths. At least the autobins will be drawn with the correct width.
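
For numeric bins, the size rules above can be sketched roughly as follows. This is an illustration, not the code in this PR: `harmonizeBinSize` and `nextNiceStep` are hypothetical names, the 1-2-5 step sequence is an assumption about how autobin sizes progress, and "next largest integer fraction" is read here as the largest `manual / k` that does not exceed the auto size. The shift of the bin start and 'M<#>' date bins are left out, and at least one autobinned trace is assumed.

```javascript
// Hypothetical helper: the next "nice" step above `size`, assuming autobin
// sizes follow a 1-2-5 sequence (an assumption, not taken from this PR).
function nextNiceStep(size) {
    var base = Math.pow(10, Math.floor(Math.log10(size)));
    var candidates = [2 * base, 5 * base, 10 * base];
    for (var i = 0; i < candidates.length; i++) {
        // small tolerance so that size 2 moves on to 5 instead of staying at 2
        if (candidates[i] > size * (1 + 1e-9)) return candidates[i];
    }
    return 20 * base;
}

// Hypothetical sketch of the size-matching rules for one group of traces.
// autoSizes: sizes auto-assigned to the autobinned traces (non-empty)
// manualSizes: sizes of the manually binned traces, in trace order
function harmonizeBinSize(autoSizes, manualSizes) {
    // all autobinned traces share the smallest auto-assigned size
    var minAuto = Math.min.apply(null, autoSizes);
    if (!manualSizes.length) return minAuto;

    // only the first manual histogram is considered
    var manual = manualSizes[0];

    // manual bins somewhat larger than the autobins: expand to match exactly
    if (manual > minAuto && manual < nextNiceStep(minAuto)) return manual;

    // otherwise shrink to the largest integer fraction of the manual size
    // that does not exceed the auto size
    return manual / Math.ceil(manual / minAuto);
}

console.log(harmonizeBinSize([2, 1, 5], [])); // 1
console.log(harmonizeBinSize([2], [3]));      // 3
console.log(harmonizeBinSize([2], [10]));     // 2
```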

// with each other).
//
// TODO: there's probably a weird case here where a larger bin pushes the
// start/end out, then it gets shrunk and doesn't make sense with the smaller bin.
Collaborator Author

In a small set of cases, by fixing the major problem (bin sizes getting drawn wrong) we introduce a smaller problem (bin edges get shifted so that an undesirably large fraction of the data sits ambiguously right at bin edges). This is possible because we set minSize and minStart independently, so they can come from different traces. The vast majority of the time, though, the bin edges are still positioned nicely at the end of this. A fix would need to keep minStart based on the same trace that gave minSize while expanding it to encompass all of these traces, which wouldn't be too hard for numeric bins but gets tricky for 'M<#>' bins. To me, then, this isn't worth fixing.
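
To make that edge case concrete, here is a small sketch with made-up numbers. `combineAutoBins` and `isOnEdge` are hypothetical stand-ins, not functions from this PR; the only thing taken from the comment above is that the smallest size and the smallest start are picked independently.

```javascript
// Hypothetical stand-in for the combining step: size, start and end are
// each taken independently, so they may come from different traces.
function combineAutoBins(specs) {
    var minSize = Infinity;
    var minStart = Infinity;
    var maxEnd = -Infinity;
    specs.forEach(function(spec) {
        minSize = Math.min(minSize, spec.size);
        minStart = Math.min(minStart, spec.start);
        maxEnd = Math.max(maxEnd, spec.end);
    });
    return {start: minStart, end: maxEnd, size: minSize};
}

// true if `value` falls exactly on a bin edge of `spec`
function isOnEdge(value, spec) {
    return ((value - spec.start) / spec.size) % 1 === 0;
}

// Made-up autobin results: traceA holds the integers 1..4 (edges at half
// integers), traceB holds only 0, 2, 4 (edges at odd integers).
var traceA = {start: 0.5, end: 4.5, size: 1};
var traceB = {start: -1, end: 5, size: 2};

// size comes from traceA, start from traceB
var combined = combineAutoBins([traceA, traceB]);

console.log(combined);              // { start: -1, end: 5, size: 1 }
console.log(isOnEdge(2, traceA));   // false
console.log(isOnEdge(2, traceB));   // false
console.log(isOnEdge(2, combined)); // true: integers now sit on bin edges
```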

Contributor

I don't mind not fixing this either. It would be nice to add a test case to document this behavior though.

Collaborator Author

tested in ec12eab

* return an array of traces that are all stacked or grouped together
* TODO: only considers histograms. Should we also harmonize with bars?
* in principle people can mix and match these, but bars always
* specify their positions explicitly...
Collaborator Author

We could try to treat bars the same as we treat manually-binned histograms in adjusting the autobins. That has more edge cases to consider though, nonuniform bar spacing being perhaps the hardest to manage, so I'm inclined not to worry about this one either. If someone is grouping/stacking bars with histograms I'd like to think it would be clear to them that they need to set the histogram binning to match the bar positions...

Contributor

> I'm inclined not to worry about this one either

No worries for me either.

// the raw data was prepared in calcAllAutoBins (during the first trace in
// this group) and stashed. Pull it out and drop the stash
var pos0 = trace._pos0;
delete trace._pos0;
Contributor

Maybe calcAllAutoBins should return a tuple e.g. [binspec, pos0]?

Collaborator Author

perhaps... we'd still need to stash _pos0 because it could be used in a totally separate call to calc, but at least that could all be dealt with inside calcAllAutoBins, so yeah that would be easier to understand. 👍 (unless we end up moving to setPositions...)

Collaborator Author

reorg so _pos0 is only in calcAllAutoBins in 89093ad
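
A stripped-down sketch of the stash pattern discussed in this thread, assuming the first trace of a group does the work for its siblings. Only `_pos0`, `_autoBinFinished`, `binAttr`, and the `[binSpec, pos0]` return shape come from the thread; `calcAllAutoBinsSketch`, `group`, `prepare`, and `computeSpec` are hypothetical and stand in for code that is not shown here.

```javascript
// Hypothetical sketch, not the function in this PR.
// group: all traces binned together; trace: the one being calc'd now
// prepare(trace): returns the sanitized position data for a trace
// computeSpec(allPos): returns the shared bin spec for the whole group
function calcAllAutoBinsSketch(group, trace, binAttr, prepare, computeSpec) {
    if (trace._autoBinFinished) {
        // an earlier trace in the group already did the work: clear the flag
        // so the next calc runs autobin again, then pull out the stashed
        // data and drop the stash
        delete trace._autoBinFinished;
        var stashed = trace._pos0;
        delete trace._pos0;
        return [trace[binAttr], stashed];
    }

    // first trace in the group: prepare the data for every trace at once
    var allPos = group.map(prepare);
    var spec = computeSpec(allPos);
    var pos0;

    group.forEach(function(t, i) {
        t[binAttr] = spec;
        if (t === trace) {
            pos0 = allPos[i];
        } else {
            // stash for the sibling's own calc call, which may come later
            t._pos0 = allPos[i];
            t._autoBinFinished = true;
        }
    });

    return [spec, pos0];
}
```

With this shape the caller never touches `_pos0` itself; it only destructures the returned pair.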


// all but the first trace in this group has already been marked finished
// clear this flag, so next time we run calc we will run autobin again
if(trace._autoBinFinished) {
Contributor

Maybe we should move auto-bins computations to histograms/set_positions (which loops over all traces of a given trace type) to avoid having to use _ flags like this one?

Contributor

@etpinard, Aug 14, 2017

... maybe histogram/calc.js should only sanitize pos and size values and call arraysToCalcdata, and leave the rest to set_positions.js?

Collaborator Author

> maybe histogram/calc.js should only sanitize pos and size values and call arraysToCalcdata, and leave the rest to set_positions.js?

Pro: that would avoid coupling between traces in a part of the code (calc) that's supposed to be operating on one trace at a time... and set_positions is only called after calc (as opposed to every replot), which is when it's needed.

Con: we'd need to be saving these sanitized pos0 during calc, rather than making an actual calcdata[i] array - or perhaps stashing it in calcdata[i][0] as we used to do with trace-wide stuff (and still do here and there) but then setPositions would need to fill in the rest of calcdata[i] once it determined the bin spec.

So I feel like in the end it would be more confusing that way. Thoughts?

Incidentally, it looks like we're probably calling setPositions twice unnecessarily if we have both histogram and bar traces on the same plot. I'll investigate...

Contributor

> So I feel like in the end it would be more confusing that way. Thoughts?

Good point. I don't have a strong opinion. This whole calc vs set_positions debate was just something that came to mind, not anything blocking that's for sure.

Collaborator Author

OK cool - I'm going to leave it then, but this discussion will be useful to keep in mind for the future. I suspect at some point we will want to reimagine the whole pipeline to be a bit more flexible, particularly with regard to minimizing redraw work.

Collaborator Author

> Incidentally, it looks like we're probably calling setPositions twice unnecessarily if we have both histogram and bar traces on the same plot. I'll investigate...

Yes we were. Fixed in e81fcb6 - I couldn't think of an easy way to test that we're not doing extra work, but we do have a test that this didn't break anything, in bar_and_histogram.json. I also had it avoid even doing the setPositions loop when there are no setPositions functions to call.
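
The idea of that fix, sketched with hypothetical names: `collectSetPositions`, `runSetPositions`, and the module objects are illustrative, not the code in e81fcb6. The sketch assumes that histogram and bar traces share one setPositions function, which would explain the duplicate call.

```javascript
// Collect each distinct setPositions function once, however many
// trace modules point at it.
function collectSetPositions(modules) {
    var fns = [];
    modules.forEach(function(mod) {
        var fn = mod.setPositions;
        if (typeof fn === 'function' && fns.indexOf(fn) === -1) fns.push(fn);
    });
    return fns;
}

// Call every distinct setPositions once per subplot; returns the call count.
function runSetPositions(modules, subplots, gd) {
    var fns = collectSetPositions(modules);

    // nothing to call: skip the subplot loop entirely
    if (!fns.length) return 0;

    var calls = 0;
    subplots.forEach(function(subplot) {
        fns.forEach(function(fn) {
            fn(gd, subplot);
            calls++;
        });
    });
    return calls;
}

// Made-up modules: bar and histogram share one function, scatter has none.
var sharedCalls = 0;
function sharedSetPositions() { sharedCalls++; }
var exampleModules = [
    {name: 'bar', setPositions: sharedSetPositions},
    {name: 'histogram', setPositions: sharedSetPositions},
    {name: 'scatter'}
];

console.log(runSetPositions(exampleModules, ['xy', 'x2y2'], {})); // 2
```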

jasmine.addMatchers(customMatchers);
});

function _calc(opts, extraTraces) {
Contributor

This PR probably deserves an image test too.

Collaborator Author

modified test images in 3f173a4 - previously trace1 would have gotten bin size 2, giving bin centers at 1 & 2 for trace0; 0.5, 2.5, and 4.5 for trace1; and displayed bar widths of 0.5. Now they all get bin size = displayed bar width = 1


- return trace[binAttr];
+ return [trace[binAttr], pos0];
Contributor

Awesome. Thanks!

@etpinard added this to the v1.30.0 milestone Aug 15, 2017
@etpinard
Contributor

etpinard commented Aug 15, 2017

💃

though we should probably add a few examples on https://github.com/plotly/documentation to help out users at some point.


FYI @alexcjohnson, the next oldest bug is #109 but maybe I should take that one 😄

@alexcjohnson
Collaborator Author

Haha you noticed my pattern 🥂 Some of them are getting a liiittle too old.

@jurasource

This seems to generate errors when I have multiple traces. Only happens when I toggle one on/off a few times.
I managed to narrow down the bug to release 1.30.0.
No example I'm afraid, jsfiddle etc are blocked by my firewall.

plotly-latest.js:168259 Uncaught TypeError: Cannot read property 'length' of undefined
    at Object.calc (plotly-latest.js:168259)
    at Object.829.plots.doCalcdata (plotly-latest.js:152035)
    at 753.Plotly.plot (plotly-latest.js:130164)
    at Object.725.lib.syncOrAsync (plotly-latest.js:126168)
    at Object.restyle (plotly-latest.js:131297)
    at handleClick (plotly-latest.js:116491)
    at plotly-latest.js:116410

@etpinard
Contributor

@jurasource that error looks similar to #2020 - can you confirm?

@jurasource

@etpinard yes, it's the same error

Labels
bug something broken