chore: improve schema security #23385
Codecov Report
@@ Coverage Diff @@
## master #23385 +/- ##
==========================================
+ Coverage 67.44% 67.55% +0.10%
==========================================
Files 1907 1907
Lines 73493 73546 +53
Branches 7976 7976
==========================================
+ Hits 49571 49687 +116
+ Misses 21873 21810 -63
Partials 2049 2049
... and 9 files with indirect coverage changes.
Very nice improvement! Thanks for this hard work, this will set a much better foundation for schema security! 👍
    """
    # default schema varies on a per-query basis
    if cls.supports_dynamic_schema:
        return query.schema
@betodealmeida judging from the description, the dropdown from sqllab has already been assigned as query.schema at this point?
Right!
SUMMARY
Superset supports permissions for schema-level access control; for example, a given user role might have access to only the schema `foo` in a given database. The problem is that it's really hard to figure out the schema of an unqualified table name, eg, in a query that references the table `bar` without a schema prefix.
When a given user runs such a query, Superset needs to know the schema where table `bar` lives, so that it can check whether the user has permission to access that schema. But figuring out the schema is not trivial: it depends on the database engine spec and on how the database is configured.
For example, for Postgres the schema will usually be `public`, the default one, regardless of the schema selected in SQL Lab. But it's possible to specify a different search path when creating the database. For a database configured with a search path that points to `secret`, the schema for the table `bar` in the query would be `secret`, and not `public`. To make things worse, the search path in Postgres can point to multiple schemas, in which case Superset has no idea where the data is coming from!
Other databases can specify the default schema in the SQLAlchemy URI; MySQL is one of them (though it calls it a "database").
In this case we can fetch the default schema for unqualified table names from the URI.
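
As a rough illustration of reading the default schema from a URI, the sketch below takes the path component of a connection string and treats it as the schema. The function name `default_schema_from_uri` and the example host are hypothetical, and this is not the code added by the PR; it uses only the standard library to stay self-contained.

```python
from typing import Optional
from urllib.parse import unquote, urlsplit


def default_schema_from_uri(uri: str) -> Optional[str]:
    """Return the path component of a database URI, or None if it is empty.

    Hypothetical helper for illustration only: for MySQL-style URIs the path
    holds what MySQL calls the "database", which plays the role of the schema.
    """
    path = urlsplit(uri).path          # e.g. "/sales"
    name = unquote(path.lstrip("/"))   # drop the leading slash, decode %xx
    return name or None


# Example URI with a made-up host and credentials.
print(default_schema_from_uri("mysql://user:password@db.example.com:3306/sales"))
```

In a real SQLAlchemy setup the same component is available as the `database` attribute of the URL object returned by `sqlalchemy.engine.make_url`, which is likely the better choice than hand-parsing.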
Finally, some database engine specs (like Trino, Presto, Snowflake, MySQL) can modify the schema dynamically on a per-query basis, in which case we can simply use the schema that is selected in the dropdown in SQL Lab:
superset/superset/db_engine_specs/presto.py, lines 303 to 316 in 870bf6d
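
The three cases described above (per-query schema, schema from the URI, engine default) can be sketched as one resolution order. This is a minimal sketch under stated assumptions: `Query`, `resolve_default_schema`, and its parameters are hypothetical names, and falling through to the URI when no schema is selected is an assumption of this example rather than something the description states.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Query:
    """Hypothetical stand-in for a SQL Lab query."""

    schema: Optional[str]  # schema selected in the SQL Lab dropdown, if any


def resolve_default_schema(
    query: Query,
    supports_dynamic_schema: bool,
    uri_schema: Optional[str],
    engine_default: Optional[str],
) -> Optional[str]:
    """Pick the schema that unqualified table names would resolve to."""
    # Engines that can switch schema per query: use the dropdown selection.
    # (Assumption: fall through when nothing is selected.)
    if supports_dynamic_schema and query.schema:
        return query.schema
    # Engines that carry the default schema in the SQLAlchemy URI.
    if uri_schema is not None:
        return uri_schema
    # Otherwise use the engine's default, e.g. "public" for Postgres.
    return engine_default


print(resolve_default_schema(Query("analytics"), True, "sales", "public"))
```

A security check could then compare the returned schema against the schemas the user's role may access before running the query.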
To improve the situation this PR introduces a new method to DB engine specs, `get_default_schema_for_query`, along with a couple of helper methods. This method allows the security manager to know in which default schema a given query is running, so it can validate access to unqualified table names.
BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF
TESTING INSTRUCTIONS
ADDITIONAL INFORMATION