Force lowercase column names for Snowflake and Oracle #4994

villebro · 2018-05-12T19:25:28Z

When calling cursor.description using the Oracle and Snowflake SQL Alchemy connectors, column names are returned in all uppercase. This causes problems when these column names are used to match column names in DataFrames that have been constructed from a ResultProxy instance, which for these connectors return lowercase column names when calling the keys() function. An example of this is Pandas' read_sql_query function, which is used in the Viz component.

This PR should fix #4662, #4770 and #953. A long-term sustainable fix would be replacing this and other cursor.execute()-based select queries with connection.execute()calls to ensure uniform column names. This should be fairly simple and would probably tidy up the code a fair bit, unless there is an explicit reason for using the low level API.

…/incubator-superset into db_engine_normalize_col

codecov-io · 2018-05-12T20:50:45Z

Codecov Report

Merging #4994 into master will decrease coverage by <.01%.
The diff coverage is 83.33%.

@@            Coverage Diff            @@
##           master   #4994      +/-   ##
=========================================
- Coverage    77.1%   77.1%   -0.01%     
=========================================
  Files          44      44              
  Lines        8636    8644       +8     
=========================================
+ Hits         6659    6665       +6     
- Misses       1977    1979       +2

Impacted Files	Coverage Δ
superset/sql_lab.py	`74.31% <100%> (-0.14%)`	⬇️
superset/db_engine_specs.py	`52.86% <77.77%> (+0.39%)`	⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 7d5195a...631704a. Read the comment docs.

mistercrunch · 2018-05-13T16:37:22Z

I think there are instances where we need to use the lower level API, we're likely to move towards always using the lower level API to get more engine-specific control.

Is oracle/snowflake always case unsensitive? Can case-sensitivity ever be turned on? If so would the current PR still work?

villebro · 2018-05-13T16:59:26Z

Ok, that's fine, too, as long as we're always consistently using the same API. Would the plan be to add querying methods to the db_engine_spec? This way we could have more control over the details later.

With regards to the details on Oracle/Snowflake case logic, I think this hack works fine for now, at least much better than if left untouched. Reading the docs both Oracle and Snowflake return case insensitive column names as uppercase (I'm assuming the docs refer to cursor.description), but if an uppercase column name is sent to SQLAlchemy, it treats that as case sensitive, and encloses the column name in quotes, resulting in the problem that we've observed. I can't come up with a scenario where this fix/hack would break anything, as any query that was previously working was already case insensitive, hence would be unaffected by lowercasing the column names (at least that's my interpretation).

I've actually raised the cursor.description vs resultproxy.keys() inconsistency with Snowflake and hope to get some help from them on that front, either guidance on how to use the connector or a change to it's behaviour/room for customisation of engine_params in future versions. The details surrounding the Oracle connector aren't as familiar to me, but can look into that in the coming days. At any rate, the current problem stems from mixing cursor with connection, so this problem should ultimately go away if all db connectivity is streamlined.

* Force lowercase column names for Snowflake and Oracle * Force lowercase column names for Snowflake and Oracle * Remove lowercasing of DB2 columns * Remove DB2 lowercasing * Fix test cases

mmuru · 2018-07-24T23:13:06Z

I tried to verify this fix against Snowflake and Superset 0.26.3 (latest release) and the issue is still exist. The workaround #4662 and #4770 only works and verified.

* Force lowercase column names for Snowflake and Oracle * Force lowercase column names for Snowflake and Oracle * Remove lowercasing of DB2 columns * Remove DB2 lowercasing * Fix test cases

villebro · 2018-07-25T13:48:07Z

@mmuru That's strange, this effectively does the same thing that the workarounds do. However, this was just a hack, and will no longer work due to unrelated changes in master branch. However, I currently have a local build that is working very well with Snowflake (based on master branch), but still have to to do some additional testing. I should have a fix ready in a day or so if you want to try it out. The patch should actually work well for 0.26.3, too, if you want to backport it.

mistercrunch · 2018-07-25T15:31:44Z

Should I revert?

villebro · 2018-07-25T15:45:05Z

@mmuru did you refer to PR #5467 or this PR #4994 ? #5467 won't work for 0.26.3 as it requires changes that are present in master branch. @mistercrunch I would recommend not reverting in master, as the old workaround is not compqtible with the ned db api 1 logic. I will submit a new PR soon that should fix this problem permanently.

mistercrunch · 2018-07-25T15:52:21Z

Ok let's push a fix through then.

mmuru · 2018-07-25T16:38:56Z

@villebro: I was referring your PR #4994 change set. The #4662 workaround simply outside Superset and manually setting requires_name_normalize=False. This one was very nice but one has to manually override snowflake-sqlalchemy code. The #4770 workaround is manually edit and lower case columns within Superset. Sure, let me know when your fix is ready and I will test and verify it.

villebro · 2018-07-25T21:20:21Z

@mmuru Check out #5487 . Feedback much appreciated!

* Force lowercase column names for Snowflake and Oracle * Force lowercase column names for Snowflake and Oracle * Remove lowercasing of DB2 columns * Remove DB2 lowercasing * Fix test cases

villebro added 6 commits May 12, 2018 21:56

Force lowercase column names for Snowflake and Oracle

e7b8ba8

Force lowercase column names for Snowflake and Oracle

d718831

Remove lowercasing of DB2 columns

ff40be4

Merge branch 'db_engine_normalize_col' of https://github.com/villebro…

22c42aa

…/incubator-superset into db_engine_normalize_col

Remove DB2 lowercasing

fb5e9b0

Fix test cases

631704a

mistercrunch merged commit b391676 into apache:master May 14, 2018

villebro deleted the db_engine_normalize_col branch July 3, 2018 05:17

villebro mentioned this pull request Jul 3, 2018

KeyError when using Redshift #5308

Closed

villebro mentioned this pull request Jul 23, 2018

Fix broken dedup and remove redundant db_spec logic #5467

Merged

simond mentioned this pull request Nov 6, 2018

Recent Refactor of db_engine_specs.py causes KeyError again on Redshift sources. #6162

Closed

3 tasks

yuyuhan0306 mentioned this pull request Apr 25, 2019

Charts built off Snowflake only support uppercase metric labels #7219

Closed

3 tasks

mrshu mentioned this pull request Mar 5, 2021

KeyError when using uppercase Metrics with Athena (in Superset 1.0.1) #13481

Closed

3 tasks

mistercrunch added 🏷️ bot A label used by `supersetbot` to keep track of which PR where auto-tagged with release labels 🚢 0.26.0 labels Feb 27, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Force lowercase column names for Snowflake and Oracle #4994

Force lowercase column names for Snowflake and Oracle #4994

villebro commented May 12, 2018 •

edited

Loading

codecov-io commented May 12, 2018

mistercrunch commented May 13, 2018

villebro commented May 13, 2018 •

edited

Loading

mmuru commented Jul 24, 2018

villebro commented Jul 25, 2018

mistercrunch commented Jul 25, 2018

villebro commented Jul 25, 2018

mistercrunch commented Jul 25, 2018

mmuru commented Jul 25, 2018

villebro commented Jul 25, 2018

Force lowercase column names for Snowflake and Oracle #4994

Force lowercase column names for Snowflake and Oracle #4994

Conversation

villebro commented May 12, 2018 • edited Loading

codecov-io commented May 12, 2018

Codecov Report

mistercrunch commented May 13, 2018

villebro commented May 13, 2018 • edited Loading

mmuru commented Jul 24, 2018

villebro commented Jul 25, 2018

mistercrunch commented Jul 25, 2018

villebro commented Jul 25, 2018

mistercrunch commented Jul 25, 2018

mmuru commented Jul 25, 2018

villebro commented Jul 25, 2018

villebro commented May 12, 2018 •

edited

Loading

villebro commented May 13, 2018 •

edited

Loading