-
-
Notifications
You must be signed in to change notification settings - Fork 2.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Performance regression in 3.0.0 #1052
Comments
Hi @iandanforth , thanks for the report and the reproducible example. This will be really helpful for future performance optimization. Construction Here's what I recommend. Use objects from the When I make these changes in your example, it runs in under 2 seconds. %%time
import numpy as np
import plotly.graph_objs as go
import colorlover as cl
from plotly.offline import plot
sim_duration = 40.0
time_inc = 0.1
motor_unit_count = 120
times = np.arange(0.0, sim_duration, time_inc)
# Setting colors for plot.
potvin_scheme = [
'rgb(115, 0, 0)',
'rgb(252, 33, 23)',
'rgb(230, 185, 43)',
'rgb(107, 211, 100)',
'rgb(52, 211, 240)',
'rgb(36, 81, 252)',
'rgb(0, 6, 130)'
]
# It's hacky but also sorta cool.
c = cl.to_rgb(cl.interp(potvin_scheme, motor_unit_count))
c = [val.replace('rgb', 'rgba') for val in c]
c = [val.replace(')', ',{})') for val in c]
def get_color(trace_index: int) -> str:
# The first and every 20th trace should be full opacity
alpha = 0.2
if trace_index == 0 or ((trace_index + 1) % 20 == 0):
alpha = 1.0
color = c[trace_index].format(alpha)
return color
# Per Motor Unit Force
all_array = np.ones((120, len(times)))
data = []
annotations = []
anno_offsets = {
0: 20,
19: 30,
39: 40,
59: 45,
79: 17,
99: 56,
119: 170
}
max_y = np.amax(all_array)
for i, t in enumerate(all_array):
trace = dict(
x=times[:1],
y=t[:1],
name=i + 1,
marker=dict(
color=get_color(i)
)
)
data.append(trace)
frames = []
for i in range(1, len(times), 10):
frame_data = []
for j, t in enumerate(all_array):
trace = dict(
x=times[:i],
y=t[:i],
name=j + 1,
marker=dict(
color=get_color(j)
)
)
frame_data.append(trace)
frame = dict(
data=frame_data
)
frames.append(frame)
layout = dict(
title='Motor Unit Forces by Time',
yaxis=dict(
title='Motor unit force (relative to MU1 tetanus)',
range=[0, max_y],
autorange=False
),
xaxis=dict(
title='Time (s)',
range=[0, sim_duration],
autorange=False
),
updatemenus=[{
'type': 'buttons',
'buttons': [{
'args': [
None,
{'frame': {'duration': 200, 'redraw': False},
'fromcurrent': True,
'transition': {'duration': 200, 'easing': 'linear'}
}
],
'label': 'Play',
'method': 'animate'
}]
}]
)
layout['annotations'] = annotations
fig = dict(
data=data,
layout=layout,
frames=frames
)
plot(fig, filename='regression.html', validate=False, auto_open=False) I hope that's helpful. |
@jonmmease Thanks for taking the time to look into this! This fix is very helpful for me. My remaining concern is one of optics. If someone tries to do much-data-many-frames-wow! and finds plotly slow it may deter them from using the library. To address that I recommend creating a ticket to modify the animation documentation to include exactly your suggestions. I knew to use 'redraw=False' only because of those docs. (Which are very useful and well put together!) Thanks again! |
I'm glad this fix is working for you and thanks for the helpful suggestion regarding the documentation. I'm going to leave this issue open until I have a chance to profile your example and see what can be done to make the situation a bit less painful. |
+1 for this issue. Rather than submit another ticket, I have attached my own example of poor 3.0.0 performance. On my machine, the difference was .013 seconds to 30 seconds so ~2000 times slower. As per above I got out of trouble by extracting the data as lists of dicts using the _data property and operating on them in that form, re-creating a modified figure later.
this is the performance figures for the first call
Note that most of the time is held up here....
|
Thanks for the example and the profiling information @JavascriptMick, I'll see what I can come up with... |
Alright, I've made some progress in #1061 that will be included in version 3.0.2 @iandanforth Your initial example has gone from ~110s to ~42s @JavascriptMick The creation of the dendrogram has gone from ~110s to ~20s, and the scaling that you were timing has gone from ~20s to ~2s. That said, these are still far slower than version 2.7 where the Each of your use cases has ~5000 traces in total, and for cases like these I think we'll need to encourage people to work with |
Thanks @jonmmease. Agree, with very large traces, lists and dict are the way to go. The _data property that exposes the raw data currently support this so i guess make sure that these remain in the API. The only other thing I can think of is providing a more elegant way of combining multiple plots.This ultimately was my goal when I discovered the issue (creating a Clustermap visualisation from a Heatmap and 2 dendrograms). All of this spinning and placing things on alternate axes feels clumsy. I understand that FigureFactories are the recommended answer to this and it is on my ToDo list to create a ClusterMap Figurefactory for this and raise a PR but It's unclear to me how much work this will be.. probably a lot. |
Hi @JavascriptMick , I'd rather people not get in the habit of using the What do you think of the proposal in #1079 ? If this were implemented you could request that the figure factory return a Also, the dendrogram itself with that many points is a bit of a mess visually. I'm thinking we should expose the scipy Here's one thought for making the construction of clustermap easier (short of creating a dedicated figure factory). For figure factories that only deal with a single axis (like dendrogram), it would be nice if you could pass in a |
It seems i can reproduce the same issue with this code : import plotly
import time
import plotly.plotly as py
import plotly.figure_factory as FF
print('plot version', plotly.__version__)
t1 = time.time()
z = []
for i in range(0, 100):
z.append([j for j in range(0, 100)])
figure = FF.create_annotated_heatmap(z)
print('Timing', time.time() - t1) With plotly version 3.10.0, i get around 24 seconds, versus 2.9 seconds for version 2.7.0. Is it the right place to raise the issue ? |
Hi - we are currently trying to tidy up Plotly's public repositories to help us focus our efforts on things that will help users most. Since this issue has been sitting for several years, so I'm going to close it. If it's still a concern, we'd be grateful if you could open a new issue (with a short reproducible example if appropriate) so that we can add it to our backlog. Thanks for your help - @gvwilson |
OS: OSX 10.13.2
Python: 3.6.4
I'm creating the following animated chart using the offline API.
![plotly-animation](https://user-images.githubusercontent.com/446062/42426044-5ae40b4c-82d9-11e8-8c47-a88b5e37802a.gif)
Using 2.7.0 generating the HTML file (30MB) takes ~11 seconds (0m11.044s)
Using 3.0.0 generating the HTML file takes: 4 MINUTES (4m20.393s).
There is probably a much more efficient way of making this kind of animated scatter (please tell me if there is) but in 2.7 this was at least workable, in 3.0, not so much.
The following example reproduces the behavior. The size of the data is the same as in the gif above, the actual values are not.
The text was updated successfully, but these errors were encountered: