chore: Skip the creation of secondary perms during catalog migrations #32043

Merged: Vitor-Avila merged 1 commit into master from chore/improve-catalog-migration on Jan 30, 2025

Conversation

Vitor-Avila
Contributor

SUMMARY

The migration that adds catalog permissions can take a long time: it lists every catalog visible to the connection's credentials, then lists every schema in each of those catalogs, and creates the corresponding permissions.

Because migrations typically run with downtime, a slow migration means a longer outage.

This PR adds a config flag that skips the creation of these permissions during the migration. They can be created later, once the app is up, by simply editing the DB connection.
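
The trade-off can be illustrated with a minimal Python sketch. It is not the code in superset/migrations/shared/catalogs.py: the function name, the `skip_secondary` parameter, the bracketed permission-name format, and the in-memory `INSTANCE` data are all assumptions made for illustration. It shows why a full run costs one schema listing per visible catalog, while a simplified run only touches the default catalog.

```python
from __future__ import annotations

from typing import Callable, Iterable

# Hypothetical in-memory stand-in for a DB instance: catalog -> schemas.
INSTANCE = {
    "main": ["public", "sales"],
    "analytics": ["public"],
    "archive": ["public", "y2020"],
}


def plan_catalog_permissions(
    database: str,
    default_catalog: str,
    list_catalogs: Callable[[], Iterable[str]],
    list_schemas: Callable[[str], Iterable[str]],
    skip_secondary: bool = False,
) -> list[str]:
    """Return the permission names a migration would create (illustrative only)."""
    if skip_secondary:
        # Simplified run: the other catalogs are never enumerated.
        catalogs = [default_catalog]
    else:
        # Full run: default catalog first, then every other visible catalog.
        others = sorted(c for c in set(list_catalogs()) if c != default_catalog)
        catalogs = [default_catalog, *others]

    perms: list[str] = []
    for catalog in catalogs:
        perms.append(f"[{database}].[{catalog}]")
        # One schema listing per catalog: this is the expensive part.
        for schema in list_schemas(catalog):
            perms.append(f"[{database}].[{catalog}].[{schema}]")
    return perms


full = plan_catalog_permissions("pg", "main", INSTANCE.keys, INSTANCE.__getitem__)
simplified = plan_catalog_permissions(
    "pg", "main", INSTANCE.keys, INSTANCE.__getitem__, skip_secondary=True
)
print(len(full), len(simplified))
```

With this toy data the full plan contains 8 permissions and the simplified plan 3. On an instance whose credentials can list hundreds of catalogs, the gap grows with every catalog and schema that has to be enumerated.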

BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF

No UI changes.

TESTING INSTRUCTIONS

Unit tests added.

ADDITIONAL INFORMATION

  • Has associated issue:
  • Required feature flags:
  • Changes UI
  • Includes DB Migration (follow approval process in SIP-59)
    • Migration is atomic, supports rollback & is backwards-compatible
    • Confirm DB migration upgrade and downgrade tested
    • Runtime estimates and downtime expectations provided
  • Introduces new feature or API
  • Removes existing feature or API

@github-actions github-actions bot added the risk:db-migration PRs that require a DB migration label Jan 30, 2025
@dosubot dosubot bot added the change:backend Requires changing the backend label Jan 30, 2025

@korbit-ai korbit-ai bot left a comment

I've completed my review and didn't find any issues.

Files scanned:

  • superset/migrations/shared/catalogs.py
  • superset/config.py

@michael-s-molina
Member

michael-s-molina commented Jan 30, 2025

This flag allows to skip the
creation of these secondary perms, and focus only on permissions for the default
catalog. These secondary permissions can be created later by editing the DB
connection via the UI (without downtime).

@Vitor-Avila if these migrations are optional, why are they executed by default in the first place? Isn't it better to just remove them from the upgrade script instead of creating a new feature flag? If they are not optional, what happens when they are skipped because of the feature flag?

Member

@betodealmeida betodealmeida left a comment

Nice!!

@Vitor-Avila
Contributor Author

@Vitor-Avila if these migrations are optional, why are they executed by default in the first place? Isn't it better to just remove them from the upgrade script instead of creating a new feature flag? If they are not optional, what happens when they are skipped because of the feature flag?

Hey @michael-s-molina,

Before catalog support, users who wanted to query another database in the same DB instance/host had to do it through a virtual dataset, using select * from <other_catalog>.<schema>.<table>. The "secondary permissions" are created so that, after the migration, limited users need explicit access to these other catalogs in order to query them in SQL Lab.

Existing datasets and charts will all be associated with the default_catalog in their permission settings (even if they query another catalog), because we won't parse the query to determine which catalog it actually uses. The same happens today if you create a dataset with schema_a selected in the dropdown and use select * from schema_b.table.

That said, creating these permissions during the migration is the ideal outcome, but we've seen cases where the process takes too long and causes a very long migration/downtime. For example, there used to be a Postgres cloud service named ElephantSQL where your credentials could list all DBs in the instance, even though you didn't actually have access to those other DBs.

If skipped, these permissions are added when you edit the DB configuration (either to enable the ability to change catalogs, or to make any other change).
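
One way to picture this later backfill is as a set difference between the permissions that should exist and those already stored. The Python sketch below is a hypothetical illustration, not Superset's implementation: `missing_permissions`, the bracketed permission-name format, and the sample data are assumptions.

```python
from __future__ import annotations


def missing_permissions(
    existing: set[str], database: str, catalogs: dict[str, list[str]]
) -> list[str]:
    """Return permission names that are wanted but not yet stored."""
    wanted: set[str] = set()
    for catalog, schemas in catalogs.items():
        wanted.add(f"[{database}].[{catalog}]")
        wanted.update(f"[{database}].[{catalog}].[{schema}]" for schema in schemas)
    return sorted(wanted - existing)


# Permissions left behind by a migration that only covered the default catalog.
existing = {"[pg].[main]", "[pg].[main].[public]"}
visible = {"main": ["public"], "analytics": ["public"]}

to_create = missing_permissions(existing, "pg", visible)
print(to_create)

# Once those are stored, a second pass in this sketch finds nothing left to add.
existing.update(to_create)
print(missing_permissions(existing, "pg", visible))
```

Computing only the difference means the default-catalog permissions created during the migration are left untouched, and repeating the step does no extra work.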

I made it a config setting because if you're able to run the migration with the setting disabled, great: all perms get created. But if you start running into issues, such as a long downtime, then enabling the setting may be the better option.

Let me know your thoughts.

@betodealmeida
Member

@Vitor-Avila if these migrations are optional, why are they executed by default in the first place? Isn't it better to just remove them from the upgrade script instead of creating a new feature flag? If they are not optional, what happens when they are skipped because of the feature flag?

@michael-s-molina we want to apply these migrations as soon as possible, because without them the data in non-default catalogs is vulnerable — users with access to schema foo in the default catalog will be able to access schemas with the same name in the non-default catalogs, which is probably not what the administrator wants.

In most cases the migration is quick and painless, but of course there are people with huge databases where this can take considerable time. So this is a compromise between security and performance.
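
The name collision described in this comment can be sketched in Python. This is an illustration only, not Superset's security manager: the helper names and the bracketed permission-name formats are assumptions. It contrasts a check that ignores the catalog with one that includes it in the permission name.

```python
from __future__ import annotations


def legacy_schema_perm(database: str, schema: str) -> str:
    """Permission name that does not encode the catalog (assumed format)."""
    return f"[{database}].[{schema}]"


def catalog_schema_perm(database: str, catalog: str, schema: str) -> str:
    """Permission name that encodes the catalog (assumed format)."""
    return f"[{database}].[{catalog}].[{schema}]"


def can_access_legacy(
    granted: set[str], database: str, catalog: str, schema: str
) -> bool:
    # The catalog is ignored, so a grant on "foo" matches "foo" in every catalog.
    return legacy_schema_perm(database, schema) in granted


def can_access_with_catalog(
    granted: set[str], database: str, catalog: str, schema: str
) -> bool:
    # The catalog is part of the name, so each catalog needs its own grant.
    return catalog_schema_perm(database, catalog, schema) in granted


legacy_grants = {legacy_schema_perm("pg", "foo")}
catalog_grants = {catalog_schema_perm("pg", "main", "foo")}

print(can_access_legacy(legacy_grants, "pg", "other", "foo"))         # True
print(can_access_with_catalog(catalog_grants, "pg", "other", "foo"))  # False
print(can_access_with_catalog(catalog_grants, "pg", "main", "foo"))   # True
```

In the sketch, only the catalog-aware check denies access to schema foo in the non-default catalog, which is the behavior the per-catalog permissions are meant to provide.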

@michael-s-molina
Member

Got it. Thanks for the additional context @Vitor-Avila @betodealmeida!

@Vitor-Avila Vitor-Avila merged commit 3f46bcf into master Jan 30, 2025
48 of 51 checks passed
@Vitor-Avila Vitor-Avila deleted the chore/improve-catalog-migration branch January 30, 2025 21:29
@sadpandajoe sadpandajoe added the v4.1 Label added by the release manager to track PRs to be included in the 4.1 branch label Jan 30, 2025
sadpandajoe pushed a commit that referenced this pull request Jan 30, 2025
@korbit-ai korbit-ai bot mentioned this pull request Mar 6, 2025
Labels
change:backend Requires changing the backend risk:db-migration PRs that require a DB migration size/L v4.1 Label added by the release manager to track PRs to be included in the 4.1 branch
5 participants