How to plot two lines with different colors? #526

Open
ViralBShah opened this issue Jan 5, 2015 · 15 comments

Comments

@ViralBShah
Collaborator

On the Gadfly website, there are several examples with dataframes, where different data series are colored differently.

I was writing a simple tutorial, where I wanted to give multiple x and y vectors, and have each of them plotted with a different color. I just couldn't find a simple way to do that. In Matlab, this is easily accomplished by giving multiple inputs such as plot(x, y1, x, y2).

If I understand correctly, it should be possible to do this easily with layers, but is it possible to do it as easily as it is in matlab?

Cc: @shashi

@aviks
Collaborator

aviks commented Jan 6, 2015

Gadfly is actually significantly easier to use with dataframes. In particular, it is far easier to bind color to a column in the frame. Anything else is a bit awkward.

However, if you did want to plot only with arrays, one way to do what you want is as follows:

using Gadfly
# One layer per series; each layer carries its own Theme color.
plot(layer(x=1:10, y=rand(10), Geom.point, Geom.line, Theme(default_color="orange")),
     layer(x=1:10, y=rand(10), Geom.point, Geom.line, Theme(default_color="purple")))

This produces something like this:

[Image: screen shot 2015-01-05 at 23 59 52]

@ViralBShah
Collaborator Author

I wish there were a way to make that syntax a lot more compact, of course without special casing anything.

@dcjones
Collaborator

dcjones commented Jan 6, 2015

It's true that things can get ugly if the data isn't a data frame, or at least tabular.

I was thinking about a syntax to make this thing easier a while ago: #89 (comment)

I'm simultaneously impressed that I remember a comment I made a year ago and depressed that I never did anything about it. I hate to add special cases or alternative syntax (e.g. I think qplot in ggplot2 is a mistake), and generally prefer consistency to compactness, but this comes up pretty frequently, and my usual advice ("put your vectors in a data frame, then use melt to reshape it into the form Gadfly expects") isn't very satisfying.
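
For readers following along, the "data frame, then reshape" advice can be sketched roughly as below. This assumes a recent DataFrames.jl, where `stack` plays the role that `melt` had at the time; the column names `x`, `y1` and `y2` are made up for illustration.

```julia
using DataFrames, Gadfly

# Wide table: one x column and one column per series (names are illustrative).
wide = DataFrame(x = 1:10, y1 = rand(10), y2 = rand(10))

# Reshape to long form: one row per (x, series) pair.
# `stack` yields the columns :x, :variable (series name) and :value.
long = stack(wide, [:y1, :y2])

# Bind color to the series name so each series gets its own color.
p = plot(long, x = :x, y = :value, color = :variable, Geom.line)
```

With many y columns, `stack(wide, Not(:x))` should stack everything except `x`, which avoids listing the columns by hand.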

@timholy
Collaborator

timholy commented Jan 6, 2015

There are also performance considerations stemming from needing to force everything into a DataFrame (GiovineItalia/Compose.jl#105 (comment)).

But, I agree 100% that this is not an easy question to answer well. It's really hard to support many different APIs simultaneously, and I too would be quite cautious about trying.

@ViralBShah
Collaborator Author

I think there are lots of users who do not need to use DataFrames, but would love to use Gadfly. I also agree that I don't want special casing.

@johansigfrids

Even with DataFrames I find myself wishing I could just pass multiple columns to the y aesthetic and save me a lot of stacking and melting. And not just for lines, most of Gadfly's Geoms could take advantage of it.

@kzapfe

kzapfe commented Feb 6, 2015

I also think the DataFrame thing is quite awkward. I have a DataFrame in which the first column is the x value and the next 65 columns are y values. I cannot find an easy way to plot them all, just indexing the colors by column number. I have read both Gadfly's and DataFrames' documentation in detail and there seems to be none.

@lobingera

This is the question of DataFrame vs. other input being organized as different lines in the same plot. Some time ago there was a discussion on julia-users about generalizing plot APIs. Maybe a "generalized input heuristic" (read as: some code that determines what can be plotted from the input material, e.g. vector(y) -> x: enumerate elements, y: y; complex(c) -> x: real(c), y: imag(c); matrix m [n x 2] -> x: m[:,1], y: m[:,2]; and similar) could be a starting point. If more than one "set" is available, the plotting code would be asked to, e.g., cycle colors or markers...
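
A minimal sketch of what such an input heuristic might look like, using multiple dispatch. The names `xy`, `series` and `palette` are hypothetical and not part of Gadfly or any other package; this only illustrates the three mappings listed above plus color cycling.

```julia
# Hypothetical helper: normalize different inputs into an (x, y) pair.
# Real vector: x is the element index, y is the vector itself.
xy(y::AbstractVector{<:Real}) = (collect(eachindex(y)), collect(y))

# Complex vector: x is the real part, y is the imaginary part.
xy(c::AbstractVector{<:Complex}) = (real.(c), imag.(c))

# n×2 matrix: first column is x, second column is y.
function xy(m::AbstractMatrix{<:Real})
    size(m, 2) == 2 || throw(ArgumentError("expected an n×2 matrix"))
    return (m[:, 1], m[:, 2])
end

# Each input becomes one series; colors are cycled when there are
# more series than palette entries.
palette = ["orange", "purple", "green"]
series(inputs...) = [(xy(inp)..., palette[mod1(i, length(palette))])
                     for (i, inp) in enumerate(inputs)]
```

Each resulting `(x, y, color)` triple could then be handed to one plot layer, so the caller never has to build a table first.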

@tbreloff
Contributor

I'm developing a plotting interface with Gadfly as the first guinea pig (not counting Qwt, which is my package). This issue is old, but I think still very relevant... take a look (https://github.com/tbreloff/Plots.jl) and especially check out the examples for Gadfly:

https://github.com/tbreloff/Plots.jl/blob/master/docs/gadfly_examples.md

I'm eagerly awaiting people's opinions on the API, which will also help me gauge where I should prioritize my time.

@Abhdez

Abhdez commented Oct 25, 2016

Use another column in your data and bind it to the color aesthetic. The plot will group the rows by that column and give each group a different color. When you are manipulating data, you ideally want one big table with variables as columns. In this example, each X in your table gives a Y value that is also in your table, and a third column records which function the row corresponds to, e.g. X^2, 2X, e^-x, etc. The "color" column then serves as the legend. This is the easiest way, and it is a proper way to manipulate data. gl!
plot(df, x=:Xvalues, y=:Yvalues, color=:Functions, Geom.line)
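
The table described here is not shown in the comment, so the following is only a guess at how `df` might be built, assuming a recent DataFrames.jl. The column names come from the `plot` call above; the x range and the three functions are illustrative.

```julia
using DataFrames, Gadfly

xs = 0:0.1:2

# One block of rows per function; the Functions column labels each block.
df = vcat(
    DataFrame(Xvalues = xs, Yvalues = xs .^ 2,    Functions = "x^2"),
    DataFrame(Xvalues = xs, Yvalues = 2 .* xs,    Functions = "2x"),
    DataFrame(Xvalues = xs, Yvalues = exp.(-xs),  Functions = "exp(-x)"),
)

# Color is bound to the label column, which also produces the legend.
p = plot(df, x = :Xvalues, y = :Yvalues, color = :Functions, Geom.line)
```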

@bjarthur
Member

bjarthur commented Aug 11, 2017

Worth noting that in most places strings can be used as colorants, as they are automatically sent to parse(Colorant, ...). So Theme(default_color="red") should work. See #998.
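
As a small illustration of the conversion mentioned here, this is roughly what parsing a color name looks like when done by hand with Colors.jl. It is a sketch of the idea, not a description of Gadfly's internals.

```julia
using Colors

# Parse a color name into a colorant value.
c = parse(Colorant, "red")

# The same color written with the string macro from Colors.jl.
d = colorant"red"
```

Both forms should yield the same color, which is why a plain string is usually enough wherever a colorant is expected.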

@miromarszal

I believe what I want to do is the same issue, but I can open a new one if necessary.

I often find myself fitting an analytical model to some data and plotting this data along with the model function. I don't mind storing the data in a DataFrame, but at the same time I want to avoid tabulating the fitted function. In Gadfly, I would plot it like this:

l1 = layer(df, x=:time, y=:vals, Geom.line)
l2 = layer(t->model(t, param), extrema(df[!,:time])..., Geom.line)
plot(l1, l2)

This plots the two lines with the same color. To have them in different colors, I thought I could do the following:

l1 = layer(df, x=:time, y=:vals, Geom.line, color=["data"])
l2 = layer(t->model(t, param), extrema(df[!,:time])..., Geom.line, color=["model"])
plot(l1, l2)

but this, instead of displaying the plot, prints (I'm using Jupyter):

Plot(...)

To my surprise, if I change the geometry in the first layer to points, it will plot everything just fine:

l1 = layer(df, x=:time, y=:vals, Geom.point, color=["data"])
l2 = layer(t->model(t, param), extrema(df[!,:time])..., Geom.line, color=["model"])
plot(l1, l2)

This could be a sort of a workaround, but often a line plot is the most natural way to show what we want, e.g. when data is dense and has some fine detail. Plotting 1e4 data points brings Jupyter nearly to a halt.

What is going on? Why doesn't it work with two Geom.lines and at the same time does work with Geom.points + Geom.line?

@Mattriks
Member

See #1459, #1463 and #1465. This has been fixed on Gadfly master (]add Gadfly#master). More improvements like this are coming soon! Note that with your example above (and in Jupyter) you can see which layer is causing the issue by drawing each one separately, e.g. draw(PNG(), plot(l1)) and draw(PNG(), plot(l2)).

@miromarszal

That indeed works on master, great!

@Mattriks
Member

Mattriks commented Aug 5, 2020

Also please look at #1430, and add any changes there about color syntax that you would like to see!
