Add test harness to ensure alembic up/downgrades work as intended #3244
Tagging @conorsch and @redshiftzero for review of this proposal before I fully implement it.
Here are some more detailed thoughts about what I think needs to be in place testing-wise prior to shipping database migrations. The testing story here could balloon into a pretty significant task, so first here are the possible approaches I have in mind, ordered from least to most effort:
So if we took option 2 and implemented a simple data script that uses the state of the models on the latest release to load the data (a script we would need to update before pre-release testing for any release involving a migration), the story around testing and releases would look something like this:

Pre-release testing for release N
If we know that a particularly tricky migration step is occurring in a particular release, then we can include additional checks verifying the data integrity in that table, etc. as part of the pre-release test plan.

Per-PR testing for PRs with migrations

Reviewers would inspect migrations added in PRs to ensure that:
tl;dr: I claim the highest-priority code we need for shipping database migrations is not an automated approach but the realistic data loading script for pre-release upgrade testing (indeed, even if we had an automated approach, we'd still want a realistic data import script for pre-release testing).
I should have fully written out the functions in this example code instead of stubbing them:

```python
def load_data(self):
    with self.app.app_context():
        db.engine.execute(text('''INSERT INTO ...'''))
        self.loaded_ids = dict(...)
        ...

def check_upgrade(self):
    with self.app.app_context():
        for row in db.engine.execute(text('''SELECT id, username FROM ...''')):
            assert row['id'] in self.loaded_ids
            ...
```

This would be option 3 you described above, and I think it's absolutely necessary. Manual QA is also going to be useful, but there are cases the UI won't catch, such as what happens when we enforce foreign keys but there could be a violation. Also, the data dump script would need to include printouts to the terminal telling the QA reviewer exactly what to look for (e.g., this source has this name, this many submissions, and this many stars), and at that point we might as well just script it, no?
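As a sketch of what fully written-out functions could look like, here is a self-contained version using the stdlib `sqlite3` module in place of the app's SQLAlchemy engine; the table and column names are illustrative assumptions, not the actual schema:

```python
import sqlite3

class MigrationTester:
    """Illustrative sketch: seed data before an upgrade, then verify it
    survived. The "journalists" table and "username" column are assumed
    names, and stdlib sqlite3 stands in for the app's SQLAlchemy engine."""

    def __init__(self, db_path):
        self.db_path = db_path
        self.loaded_ids = {}

    def load_data(self):
        # Runs against the pre-upgrade schema; records what we inserted
        # so check_upgrade() can verify nothing was lost or mangled.
        conn = sqlite3.connect(self.db_path)
        conn.execute(
            "CREATE TABLE IF NOT EXISTS journalists "
            "(id INTEGER PRIMARY KEY, username TEXT NOT NULL)")
        for name in ("dellsberg", "journalist"):
            cur = conn.execute(
                "INSERT INTO journalists (username) VALUES (?)", (name,))
            self.loaded_ids[cur.lastrowid] = name
        conn.commit()
        conn.close()

    def check_upgrade(self):
        # After the migration runs, every row we loaded must still be there.
        conn = sqlite3.connect(self.db_path)
        rows = conn.execute("SELECT id, username FROM journalists").fetchall()
        conn.close()
        assert len(rows) == len(self.loaded_ids)
        for row_id, username in rows:
            assert self.loaded_ids[row_id] == username
```

In the real harness the `alembic upgrade` step would run between the two calls; here they simply run back to back to show the load/verify pattern.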
If we did do this, we would just call the script. This is also a problem if we have a release: CI is super slow right now and building Debian packages is fairly slow, so having tests that can be run directly is valuable. I could add a test covering this. My tl;dr is that a dummy data script is not going to catch enough edge cases and is going to be harder to maintain than a series of tests that are isolated to the scope of "a single migration."
During a call with @redshiftzero we agreed that both the manual test via a data dump and the per-migration tests are hard requirements.

Proposal for data dump
Most issues caused by the content in a column (e.g. special characters in a field) will only come up if we start doing tricky data migrations. In the next ~6 months of the project we will likely only be adding columns and tables and won't hit these issues. But, for the sake of future maintainers of the project, I agree we should have a representative data dump script such that there are no surprises. Some specific thoughts:
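To make that concrete, a representative dump could deliberately include awkward column contents. A minimal sketch, assuming a `sources` table with a `journalist_designation` column (both names are illustrative, and stdlib `sqlite3` stands in for the app's engine):

```python
import sqlite3

# Values chosen to exercise quoting, unicode, emoji, and whitespace handling
# in any future data migration. The table and column names are assumptions.
TRICKY_VALUES = ["O'Brien", "\u00fcn\u00efcode designation", "emoji \U0001F512", "tab\there"]

def seed_tricky_rows(db_path):
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sources "
        "(id INTEGER PRIMARY KEY, journalist_designation TEXT)")
    conn.executemany(
        "INSERT INTO sources (journalist_designation) VALUES (?)",
        [(v,) for v in TRICKY_VALUES])
    conn.commit()
    count = conn.execute("SELECT COUNT(*) FROM sources").fetchone()[0]
    conn.close()
    return count
```

Parameterized inserts (the `?` placeholders) sidestep quoting bugs in the seed script itself, so any failure a migration hits is genuinely the migration's.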
Note for the interested observer: an older wordlist was used up until 0.4 (ref: #1235) so there might be long-lived source codenames that are actually numbers from the old wordlist. But since we're not directly storing the codename this shouldn't be an issue.
@redshiftzero So there's something just a tiny bit about using
While I think a single manual step like this wouldn't be a terrible thing since it's for pre-release QA, recall that the app-test role is used in the staging environment, so we can add this dependency to the
Feature request
Description
Subtask of #1419
We need a test framework that ensures that every `alembic` database upgrade and downgrade applies cleanly.

As a note, whatever we do to load data into the database cannot depend on `models.py`, because the database schema will always match `alembic`'s `head` revision, so using it to test an upgrade makes no sense since we need to use the out-of-date models. This means we have to hand-craft code to do the inserts and selects that we want.

Initial Proposal
Branch: `migration-test-harness`

This proposal uses dynamic module loading and pytest parametrization to ensure that every new migration applies the tests in the same way. If a test module for a migration is not present, running `make test` will error out because it won't be able to find the correct module.

`securedrop/securedrop/tests/test_alembic.py`, lines 109 to 159 in cebd70f
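The dynamic-loading half of that idea might be sketched like this; the `tests.migrations.migration_<revision>` naming scheme is an assumption for illustration, not necessarily the branch's actual layout:

```python
import importlib

def load_migration_module(revision, package="tests.migrations"):
    """Import the hand-written test module for one alembic revision.

    If a developer adds a migration without a matching test module, this
    raises ModuleNotFoundError, which is what makes the test run error
    out. The module naming scheme here is an illustrative assumption.
    """
    return importlib.import_module("{}.migration_{}".format(package, revision))
```

A pytest harness would then parametrize over every revision found in the alembic versions directory and call each loaded module's `load_data()`/`check_upgrade()` around an `alembic upgrade` step.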
User Stories
As a dev, I want to know that my DB migration scripts work in a prod-like environment.