An error occurred when trying to visualize a group of 4-million-row data #12052

Closed
3 tasks done
lanyusea opened this issue Dec 15, 2020 · 3 comments
Labels
!deprecated-label:bug (Deprecated label - Use #bug instead), explore:error (Related to general errors of Explore)

Comments


lanyusea commented Dec 15, 2020

I have a dataset of about 4 million rows stored in MySQL, and I'm trying to load it into Superset and visualize it.

It works well when there are only a few thousand rows, but after I put all the data into MySQL, Superset fails on the chart page with the error "An error occurred" and gives no further information, either on the page or in the log.

Expected results

Superset fetches the data, even if it reports problems along the way.

Actual results

  1. It shows "An error occurred" and returns no result.
  2. A large temporary database is created in my MySQL instance.

Screenshots

image

The log file has nothing useful either:
image

How to reproduce the bug

  1. Add 4 million rows of data to MySQL
  2. Build the visualization in Superset

Environment

(please complete the following information):

  • superset version: 0.37.2
  • python version: 3.6.9
  • node.js version: didn't install

Superset is installed from scratch, not via Docker.

Checklist

Make sure to follow these steps before submitting your issue - thank you!

  • I have checked the superset logs for python stacktraces and included it here as text if there are any.
  • I have reproduced the issue with at least the latest released version of superset.
  • I have checked the issue tracker for the same issue and I haven't found one similar.

Additional context

I have searched and found only a few discussions about data scale. The official documentation only mentions loading speed, and in #4588 people also talk only about speed. Nobody seems to have hit an outright failure like mine.

Also, to make it runnable, I have extended the timeout to 3000s and raised several other limit settings.
image
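
For readers who cannot see the screenshot, here is a minimal sketch of what such overrides might look like in superset_config.py. Only the 3000s web server timeout is confirmed by this thread; the other settings and their values are assumptions for illustration, since the screenshot's contents are not recoverable.

```python
# superset_config.py -- illustrative overrides, not the reporter's actual file.

# Confirmed by the thread: the web server timeout was raised to 3000 seconds.
SUPERSET_WEBSERVER_TIMEOUT = 3000

# Assumed examples of "other limit settings"; the values are hypothetical.
ROW_LIMIT = 5000000       # maximum rows a chart query may return
SQLLAB_TIMEOUT = 3000     # seconds before a synchronous SQL Lab query is stopped
```

Raising these values only affects Superset itself; any server or proxy in front of it keeps its own limits.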

The problem always happens after about 600s of querying, so I suspect there is still some timeout on the connection, but I didn't find any other one in Superset's config.py.

So I'm wondering how I can find out what the exact error is, and how to solve it.

Or is there any suggestion for how I can work around this issue?

Thanks!

@lanyusea lanyusea added the !deprecated-label:bug Deprecated label - Use #bug instead label Dec 15, 2020
@lanyusea
Author

I just found that Superset resends the query request if the result doesn't come back within about 120s. So if MySQL fails to finish the query in 120s, the repeated requests pile up until all resources are used up.

Does anyone know how I can modify this retry behavior?
I didn't find any related settings in the config file.

@zuzana-vej zuzana-vej added the explore:error Related to general errors of Explore label Apr 20, 2021
@robdiciuccio
Member

@lanyusea can you elaborate on the MySQL retry logic you mentioned?

I see that you increased the SUPERSET_WEBSERVER_TIMEOUT value in superset_config.py. How are you running Superset? Gunicorn? Do you have nginx or another web server proxying requests to the app? These each have their own timeout settings as well.
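
To illustrate the point about stacked timeouts, here is a rough sketch that treats each layer's timeout as an independent limit and reports the smallest one as the likely cutoff. The layer names and all values except the 3000s Superset setting are hypothetical, and real behavior can be more subtle than a simple minimum.

```python
from typing import Dict, Tuple


def effective_timeout(layer_timeouts: Dict[str, int]) -> Tuple[str, int]:
    """Return the layer with the smallest timeout and that timeout in seconds."""
    name = min(layer_timeouts, key=layer_timeouts.get)
    return name, layer_timeouts[name]


# Hypothetical values for illustration; only the Superset value comes from the thread.
layers = {
    "Superset SUPERSET_WEBSERVER_TIMEOUT": 3000,
    "Gunicorn worker timeout": 120,
    "reverse proxy read timeout": 600,
}

bottleneck, seconds = effective_timeout(layers)
print("Requests are likely cut off by {} after {}s".format(bottleneck, seconds))
```

Under these assumed numbers, raising the Superset setting alone would change nothing, because a shorter limit elsewhere still ends the request first.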

@lanyusea
Author

@robdiciuccio I run it with Gunicorn. I managed to increase the timeout in the Gunicorn thread settings, and it works.

But it still cannot hold a long connection when the query takes more than 20 minutes. For now I work around this by speeding up the query.
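
For anyone landing here with the same symptom, a Gunicorn timeout change along these lines might look like the sketch below. The thread does not show the actual settings used, so the file name, worker class, and every value here are assumptions.

```python
# gunicorn.conf.py -- hypothetical configuration, not the reporter's actual file.

bind = "0.0.0.0:8088"     # assumed address and port
workers = 4               # assumed number of worker processes
worker_class = "gthread"  # threaded workers; an assumption based on "thread settings"
threads = 8               # assumed threads per worker

# Seconds a worker may stay silent before Gunicorn kills and restarts it.
# Matched here to the 3000s Superset timeout mentioned earlier in the thread.
timeout = 3000
```

Gunicorn can load such a file with `gunicorn -c gunicorn.conf.py "superset.app:create_app()"`. Queries that run longer than any remaining limit (for example a database or proxy timeout) would still fail, which may explain the 20-minute ceiling described above.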
