Force lowercase column names for Snowflake and Oracle #4994
…/incubator-superset into db_engine_normalize_col
Codecov Report
@@            Coverage Diff            @@
##           master   #4994    +/-   ##
=========================================
- Coverage    77.1%    77.1%   -0.01%
=========================================
  Files          44       44
  Lines        8636     8644      +8
=========================================
+ Hits         6659     6665      +6
- Misses       1977     1979      +2
I think there are instances where we need to use the lower-level API; we're likely to move towards always using the lower-level API to get more engine-specific control. Are Oracle and Snowflake always case insensitive? Can case sensitivity ever be turned on? If so, would the current PR still work?
Ok, that's fine, too, as long as we're always consistently using the same API. Would the plan be to add querying methods to the With regards to the details on Oracle/Snowflake case logic, I think this hack works fine for now, and is at least much better than leaving it untouched. According to the docs, both Oracle and Snowflake return case-insensitive column names as uppercase (I'm assuming the docs refer to cursor.description), but if an uppercase column name is sent to SQLAlchemy, it treats the name as case sensitive and encloses it in quotes, resulting in the problem that we've observed. I can't come up with a scenario where this fix/hack would break anything, as any query that was previously working was already case insensitive and hence would be unaffected by lowercasing the column names (at least that's my interpretation). I've actually raised the
* Force lowercase column names for Snowflake and Oracle * Force lowercase column names for Snowflake and Oracle * Remove lowercasing of DB2 columns * Remove DB2 lowercasing * Fix test cases
@mmuru That's strange, this effectively does the same thing that the workarounds do. However, this was just a hack, and will no longer work due to unrelated changes in |
Should I revert?
@mmuru did you refer to PR #5467 or this PR #4994? #5467 won't work for 0.26.3, as it requires changes that are only present in the master branch. @mistercrunch I would recommend not reverting in master, as the old workaround is not compatible with the new db api 1 logic. I will submit a new PR soon that should fix this problem permanently.
Ok, let's push a fix through then.
@villebro: I was referring to your PR #4994 change set. The #4662 workaround is applied outside Superset, by manually setting requires_name_normalize=False. That one was very nice, but one has to manually override the snowflake-sqlalchemy code. The #4770 workaround is to manually edit and lowercase the columns within Superset. Sure, let me know when your fix is ready and I will test and verify it.
When calling cursor.description using the Oracle and Snowflake SQLAlchemy connectors, column names are returned in all uppercase. This causes problems when these column names are used to match column names in DataFrames that have been constructed from a ResultProxy instance, which for these connectors returns lowercase column names when calling the keys() function. An example of this is Pandas' read_sql_query function, which is used in the Viz component.
This PR should fix #4662, #4770 and #953. A long-term sustainable fix would be replacing this and other cursor.execute()-based select queries with connection.execute() calls to ensure uniform column names. This should be fairly simple and would probably tidy up the code a fair bit, unless there is an explicit reason for using the low-level API.
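For illustration, the lowercasing described above can be sketched roughly as follows. This is not the code from this PR: the helper name normalized_column_names is hypothetical, and the standard-library sqlite3 driver with uppercase aliases stands in for an Oracle or Snowflake cursor, on the assumption that those connectors expose the same DB-API cursor.description layout (name as the first item of each entry).

```python
import sqlite3


def normalized_column_names(description):
    """Return lowercase column names from a DB-API cursor.description.

    Hypothetical helper for illustration only; not the function added in this PR.
    """
    # Each entry in cursor.description is a sequence whose first item is the column name.
    return [col[0].lower() for col in description]


# sqlite3 is only a stand-in here: the uppercase aliases imitate the
# all-uppercase names that the Oracle and Snowflake cursors are described as returning.
conn = sqlite3.connect(":memory:")
cursor = conn.cursor()
cursor.execute('SELECT 1 AS "USER_ID", 2 AS "ORDER_TOTAL"')

raw_names = [col[0] for col in cursor.description]    # names as the cursor reports them
names = normalized_column_names(cursor.description)   # lowercased for matching
conn.close()
```

With the names lowercased, they line up with the lowercase keys that the higher-level API is described as returning, so lookups against DataFrame columns no longer depend on which API produced the result.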