Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Error when trying to visualize 'Time Series - Line Chart' -- Error message is unclear #5700

Closed
3 tasks done
HectorLira opened this issue Aug 22, 2018 · 17 comments
Closed
3 tasks done
Labels
inactive Inactive for >= 30 days

Comments

@HectorLira
Copy link

HectorLira commented Aug 22, 2018

Make sure these boxes are checked before submitting your issue - thank you!

  • I have checked the superset logs for python stacktraces and included it here as text if any
  • I have reproduced the issue with at least the latest released version of superset
  • I have checked the issue tracker for the same issue and I haven't found one similar

Superset version

0.26.3

Expected results

Time Series - Line Chart rendered.

Actual results

Error with no clear message.

Steps to reproduce

  1. Create a query in SQL Lab with a datetime column and an integer column (or just use any datasource that has a datetime column and group by that column to count the number of rows)

  2. Click on 'Explore' to visualize the resulting table
    image

  3. Choose Time Series - Line Chart as an option to visualize

  4. Add your integer column as value for the 'Metrics' option

  5. Click on 'Run Query'
    image

Here is the trackeback I get:

Traceback (most recent call last):
  File "/home/work/incubator-superset/superset/views/core.py", line 1115, in generate_json
    payload = viz_obj.get_payload()
  File "/home/work/incubator-superset/superset/viz.py", line 347, in get_payload
    payload['data'] = self.get_data(df)
  File "/home/work/incubator-superset/superset/viz.py", line 1217, in get_data
    df = self.process_data(df)
  File "/home/work/incubator-superset/superset/viz.py", line 1137, in process_data
    values=utils.get_metric_names(fd.get('metrics')))
  File "/usr/local/lib/python3.6/site-packages/pandas/core/frame.py", line 5303, in pivot_table
    margins_name=margins_name)
  File "/usr/local/lib/python3.6/site-packages/pandas/core/reshape/pivot.py", line 61, in pivot_table
    raise KeyError(i)
KeyError: 'SUM(total_visits)'

The traceback doesn't seem to be helpful.

This is the query that is generated automatically to visualize the resulting table:

SELECT weekly AS __timestamp,
       SUM(total_visits) AS "SUM(total_visits)"
FROM
  (SELECT DATE_TRUNC('WEEK', "timestamp") weekly,
          COUNT(*) total_visits
   FROM my_table
   WHERE "timestamp" >= '2018-06-01'
   GROUP BY 1) AS expr_qry
WHERE weekly >= '1918-08-22 00:00:00'
GROUP BY weekly
ORDER BY "SUM(total_visits)" DESC
LIMIT 50000;

I've tried the following to see if it works:

  • grouping the data by the column 'weekly' in the 'Query' options,
  • adding a value for Time Grain,
  • trying to visualize using both Chrome and Mozilla Firefox,
  • using a different time column in the table to see if there was any conflict caused by the name of the column being 'timestamp'
  • changing the 'Metrics' definition from 'SUM(total_visits)' to 'total_visits' -> This produces another message:
    image

image

  • after the step above, I tried to group by the columns I was told to, but another message pops out:
    image
@HectorLira
Copy link
Author

So, I managed to find a solution. Looking at the discussion in Issue #4770, it didn't feel quite the same issue as the one displayed here.

This is how I managed to solve it:

  1. I went to the tab 'Charts'
  2. I clicked on 'Edit records' in the corresponding chart
  3. In the 'Parameters' section, I looked for the variable with name 'label' and changed it from SUM(weekly_returning_users) to sum(weekly_returning_users)
    image

Comparing to Issue #4770, I am editing the labels used in the Chart and not the datasource column names.

@villebro
Copy link
Member

villebro commented Aug 23, 2018

@HectorLira what database engine are you using? Furthermore, have you tested 0.27 or master branch? 0.26 branch was still using dbapi2 for queries which lowercased column names in the query result, while 0.27+ uses dbapi1 which retains the original case of the query. So I suspect this issue should not persist in 0.27+.

@HectorLira
Copy link
Author

My database engine is Amazon Redshift. I have tried using PostgreSQL and the problem doesn't seem to appear there. Haven't tried 0.27+ but let me try using it and report what I see.

@villebro
Copy link
Member

@HectorLira Redshift is one of three engines that has been causing a lot of grief when used with Superset. There's a PR in the making that I hope will tackle these problems for good. I would appreciate it if you can help testing it once it is done (main functionality is ready, working on some final details).

@rae89
Copy link

rae89 commented Sep 6, 2018

@villebro I am encountering the same issue as @HectorLira, but the solution of lowercasing the sum does not seem to fix the issue for me. I am running on docker image on master branch: https://github.com/apache/incubator-superset/tree/master/contrib/docker

Any idea how to check which version of superset the docker image is running?

@villebro
Copy link
Member

villebro commented Sep 6, 2018

Thanks @rae89 for the feedback. As Superset recently migrated to using dbapi1 for queries, the quickfix now is uppercasing everything rather than lowercasing column bames. I'm working on a fix for Redshift in #5827 and should have something to test later today, and would appreciate testing help/feedback, especially from Redshift users.

@rae89
Copy link

rae89 commented Sep 6, 2018

@villebro I am willing to help with testing out changes for redshift. Which version should I be trying out? And since I am using the docker contrainer to test out superset, how do I change the superset version in the docker container, or is the container building the version that is checked out?

@villebro
Copy link
Member

villebro commented Sep 7, 2018

Awesome @rae89 ; please test my branch from PR #5827 . There's many ways you can try it out, but a quick and dirty way would be to clone the branch into /tmp as follows:

cd /tmp
git clone -b timestamp_label https://github.com/villebro/incubator-superset.git

@pc042672
Copy link

I face the same issue. I use postgresql

I will get Unexpected error while using time series char if I select time grain. (my date time type is timestamp without time zone)

screen shot 2018-09-11 at 12 55 38 am

screen shot 2018-09-11 at 12 55 32 am

screen shot 2018-09-11 at 12 55 22 am

@HectorLira
Copy link
Author

HectorLira commented Sep 10, 2018

@pc042672 I was getting the same error when using the time series chart and selecting time grain. I was able to use time grain successfully by building the same query from SQL Lab and declaring a time zone for my time column. Here is an image of the final query Superset uses to graph and now I can use the time grain freely (even with filter boxes). The name of my time column is timestamp, this is why I have to use double quotes to reference the column.
superset

@villebro
Copy link
Member

@pc042672 would you mind opening a new issue for this? I have a fix for the original issue that's nearing completion and don't mind working on the postgres timestamp time grain issue too, but I would prefer to keep them in two separate PRs to isolate any potential problems they might introduce.

@millar
Copy link

millar commented Sep 11, 2018

I am facing the same issue but with BigQuery as our database engine.

Both with and without a time grain I get a yellow warning message stating: 'SUM(pings)'. The table visualisation works fine but never the line chart.

Here's the query it generates:

SELECT TIMESTAMP_TRUNC(created, DAY) AS `__timestamp`,
       SUM(`pings`) AS `SUM_pings_`
FROM `table`
WHERE `created` >= '2018-08-14 00:00:00'
  AND `created` <= '2018-09-11 13:30:56'
GROUP BY `__timestamp`
ORDER BY `SUM_pings_` DESC
LIMIT 5000;

@jdavidheiser
Copy link

It looks like the PR that solves this is approved - is there any word on when it will ship? This is pretty much blocking us from using Superset seriously right now.

@villebro
Copy link
Member

@jdavidheiser if you have the opportunity to try out the PR (#5827) I would appreciate any feedback. Also @millar sorry for having missed the comment in September; the aforementioned PR should fix that, too. The postgres error that was referenced here earlier I believe was fixed in #6453 .

@stale
Copy link

stale bot commented Apr 10, 2019

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the inactive Inactive for >= 30 days label Apr 10, 2019
@villebro
Copy link
Member

@HectorLira I believe this can be closed.

@stale stale bot removed the inactive Inactive for >= 30 days label Apr 11, 2019
@stale
Copy link

stale bot commented Jun 10, 2019

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. For admin, please label this issue .pinned to prevent stale bot from closing the issue.

@stale stale bot added the inactive Inactive for >= 30 days label Jun 10, 2019
@stale stale bot closed this as completed Jun 17, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
inactive Inactive for >= 30 days
Projects
None yet
Development

No branches or pull requests

6 participants