Bad y axis scaling with color aesthetic #560
Comments
Fixed for dodge now. Still need to decide what to do with stack. Your second problem, statistics clobbering, happens because currently only one statistic is allowed per layer, so they do get clobbered. I think I avoided allowing multiple stats because it would make arguments to
I had not really considered stacked histograms with a log scale. I agree that there is no optimal solution, but I feel that ggplot2's is more misleading than your naive approach (although in your examples I think the larger block of color goes on the bottom, not on top; it is possible that I am thinking about this wrong).

In general I am not a huge fan of stacking. Even on a linear scale, stacked plots are pretty hard to read. I mean, if you look at my stacked plot above, do you really feel like you have any idea what the distribution of I-colored diamonds is? I certainly do not. I guess what I am saying is, don't sweat it too much. It will be confusing no matter what you do.

With "dodge" (or my hand-rolled overlap) the only problem is what to do if you want to somehow normalize the different distributions. This can be useful if you want to compare the tails or something, but you happen to have a different number of samples from each population. density=true normalizes to unit area (I believe), which makes for some very crazy viewing on a log scale. Ideally each would be rescaled to a common area such that each bin had a height of at least one. Anyway, I am not sure what something like this would ideally look like.

As for the clobbering, I thought something like that might be happening. I am not sure I have a brilliant idea about what general combinations of stats should mean. I do think that using Geom.step for histograms (especially when you want to plot a bunch of overlapping ones) is fairly desirable, though.

Again, thanks for the speedy response.
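To make the normalization point concrete, here is a minimal sketch in plain Python (not Gadfly code) of one possible reading of "rescaled to a common area such that each bin had a height of at least one". The function names, the bin width, and the counts are all made up for illustration.

```python
import math

def unit_area(counts, bin_width):
    """Normalize raw bin counts so the bars enclose an area of 1."""
    total = sum(counts)
    return [c / (total * bin_width) for c in counts]

def common_area(populations, bin_width):
    """Hypothetical rescaling: give every population the same area, chosen
    so that the smallest non-empty bin anywhere has height exactly 1."""
    dens = [unit_area(p, bin_width) for p in populations]
    smallest = min(d for pop in dens for d in pop if d > 0)
    return [[d / smallest for d in pop] for pop in dens]

bin_width = 1000.0           # made-up bin width
big = [4000, 900, 90, 10]    # made-up counts, 5000 samples
small = [40, 8, 2, 0]        # made-up counts, 50 samples

dens_big = unit_area(big, bin_width)
# Unit-area heights are far below 1 here, so their log10 is negative.
log_dens_big = [math.log10(d) for d in dens_big]

scaled_big, scaled_small = common_area([big, small], bin_width)
```

With unit area every bar sits below 1, which is why the log axis looks so odd. After the common rescaling both populations enclose the same area and every non-empty bin has a non-negative log10 height.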
👍 to your proposal for proportional stacking with the correct summed heights for log histograms. Usually, the only useful information I can glean out of stacked plots is a rough sense of proportion for any given region of the histogram so that makes a lot of sense to me. And bar heights being order sensitive just gives me the willies.
Yeah, this seems like the best option (or maybe the worst option except for all the others). There's still the potential to mislead, but not as badly as the alternatives, and the resulting plot is pretty useful.
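The proportional stacking idea discussed here can be sketched in a few lines of plain Python. This is an illustration of the arithmetic only, not Gadfly's implementation: the function names and counts are hypothetical, and `cumulative_log_stack` is just one example of an order-sensitive alternative, included for contrast.

```python
import math

def proportional_log_stack(counts):
    """Bar height is log10 of the summed count; each group gets a share of
    that height proportional to its share of the count."""
    total = sum(counts)
    if total <= 0:
        return [(0.0, 0.0) for _ in counts]
    height = math.log10(total)
    segments, bottom = [], 0.0
    for c in counts:
        top = bottom + height * c / total
        segments.append((bottom, top))
        bottom = top
    return segments

def cumulative_log_stack(counts):
    """Order-sensitive alternative: stack in count space, then take log10
    of the running total (baseline at a count of 1)."""
    segments, running = [], 0
    for c in counts:
        bottom = math.log10(running) if running >= 1 else 0.0
        running += c
        top = math.log10(running) if running >= 1 else 0.0
        segments.append((bottom, top))
    return segments

def heights(segments):
    return [top - bottom for bottom, top in segments]

counts = [900, 90, 10]  # made-up counts for one bin, three groups

prop = heights(proportional_log_stack(counts))
prop_reversed = heights(proportional_log_stack(counts[::-1]))
cumul = heights(cumulative_log_stack(counts))
cumul_reversed = heights(cumulative_log_stack(counts[::-1]))
```

In both schemes the bar tops out at log10(1000) = 3. The proportional version gives each group the same segment height whatever the stacking order, whereas in the cumulative version the 900-count group is about 2.95 units tall when drawn first and only 1 unit tall when drawn last.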
When I try the opaque bars in different layers, I get an error:
You may have to be on Gadfly master to do that. I haven't tagged a new version in a while.
This is all in current master of Gadfly and Julia 0.3.7.
Currently using Scale.y_log10 with the color aesthetic bound produces bad scaling. The following code:
produces
but if we try to bind color:
this is the result:
Clearly the y scale is way off. A similar thing happens with a linear y axis and position=:dodge. The default stacked position scales correctly:
But the dodge position does not:
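For reference, the y-axis limit one would expect differs between the two positions: stacked bars reach the per-bin sum across groups, while dodged bars only reach the largest single group count. The sketch below shows that difference in plain Python. It does not reproduce Gadfly's scale fitting, and the thread does not say what actually caused the bad scaling; `expected_y_max` and the counts are hypothetical.

```python
def expected_y_max(counts_by_group, position):
    """counts_by_group maps a group label to its list of per-bin counts
    (all lists the same length)."""
    per_bin = list(zip(*counts_by_group.values()))
    if position == "stack":
        # Stacked bars pile up, so the tallest bar is the largest bin total.
        return max(sum(bin_counts) for bin_counts in per_bin)
    if position == "dodge":
        # Dodged bars sit side by side, so only the tallest single bar matters.
        return max(max(bin_counts) for bin_counts in per_bin)
    raise ValueError("unknown position: %r" % (position,))

# Made-up counts for three color groups over three bins.
counts = {"D": [30, 12, 3], "E": [25, 20, 5], "F": [10, 8, 2]}

stack_max = expected_y_max(counts, "stack")  # 30 + 25 + 10
dodge_max = expected_y_max(counts, "dodge")  # tallest individual bar
```

An axis fitted to the stacked total while the bars are drawn dodged would leave a lot of empty space above them, which is the kind of mismatch described above.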
The correct behavior can be manually simulated with layers (Alpha channel to the rescue!) but it is slightly awkward (and this is only for three of the seven colors):
As an aside (or maybe a separate issue), this sort of plot can work very well as a step plot (where only the outline of the histogram is shown) instead of a bar-style histogram. Sadly, Stat.histogram seems to clobber Stat.step, so Geom.step with Stat.histogram is the same as Geom.line with Stat.histogram:
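To spell out the difference between the two outputs, here is a minimal Python sketch of the vertices a step outline would need, next to the polyline through bin centers that a plain line produces. It is not Gadfly code; `step_outline`, `center_line`, and the sample bins are made up for illustration.

```python
def step_outline(edges, counts):
    """Vertices of a histogram outline: up from zero at the first edge,
    flat across each bin, and back down to zero at the last edge."""
    if len(edges) != len(counts) + 1:
        raise ValueError("need one more edge than counts")
    xs, ys = [edges[0]], [0]
    for left, right, c in zip(edges, edges[1:], counts):
        xs += [left, right]
        ys += [c, c]
    xs.append(edges[-1])
    ys.append(0)
    return xs, ys

def center_line(edges, counts):
    """Vertices of a line drawn through the bin centers."""
    xs = [(left + right) / 2 for left, right in zip(edges, edges[1:])]
    return xs, list(counts)

edges = [0, 1, 2, 3]   # made-up bin edges
counts = [2, 5, 1]     # made-up bin counts

outline_x, outline_y = step_outline(edges, counts)
line_x, line_y = center_line(edges, counts)
```

The outline keeps each bin flat across its full width, so several overlapping histograms stay readable, while the line through the centers slopes between bins and hides where they begin and end.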