
Commit 569fc02

Merge pull request #194 from pacificclimate/i-180-crmp_network_geoserver
Update view CrmpNetworkGeoserver with variable tags column
2 parents 9a135d7 + 4be2fdd commit 569fc02

17 files changed, +417 -106 lines changed

README.md

+6-3
@@ -19,16 +19,19 @@ With this package, one can recreate the database schema in a [PostgreSQL](http:/
1919
- [ORM contents and usage](docs/orm.md)
2020
- Database operations with Alembic
2121
- [Introduction](docs/database-operations/introduction.md)
22-
- [Creating a new migration](docs/database-operations/create-new-migration.md)
2322
- [Applying a migration: Upgrade](docs/database-operations/migrate-upgrade.md)
2423
- [Applying a migration: Downgrade](docs/database-operations/migrate-downgrade.md)
2524
- [Creating a new PyCDS database](docs/database-operations/create-new-db.md)
26-
- [Creating the initial migration](docs/database-operations/create-initial-migration.md)
2725
- Testing
2826
- [Project unit tests](docs/testing/project-unit-tests.md)
2927
- [Test migrations with a test database](docs/testing/test-migrations.md)
3028
- [Unit tests in client code](docs/testing/unit-tests-in-client-code.md)
31-
- Development notes
29+
- Development
30+
- Migrations
31+
- [Introduction](docs/dev-notes/migrations/introduction.md)
32+
- [IMPORTANT ADVICE AND GUIDELINES](docs/dev-notes/migrations/important-notes.md)
33+
- [Creating a new migration](docs/dev-notes/migrations/create-new-migration.md)
34+
- [Creating the initial migration](docs/dev-notes/migrations/create-initial-migration.md)
3235
- [Creating and using SQLAlchemy extensions](docs/dev-notes/sqlalchemy-extensions.md)
3336
- [Creating and using Alembic extensions](docs/dev-notes/alembic-extensions.md)
3437

docs/database-operations/create-new-migration.md

-56
This file was deleted.
@@ -0,0 +1,99 @@
1+
# Creating a new migration
2+
3+
After the PyCDS ORM has been modified, you will usually want to create an Alembic migration script to enable existing databases to be upgraded to the new model.
4+
5+
Alembic can generate a migration script for you. It can generate an empty script, or autogenerate a script containing changes inferred by comparison between the ORM and an existing database. The command for this is `alembic revision ...`.
6+
7+
In the context of PCIC's databases (CRMP, Metnorth), autogeneration is less useful than one might hope. PCIC's databases have not been managed solely through PyCDS, and so diverge from the ORM definitions. Much of this divergence has been cleaned up, but there are still differences in some table definitions, indexes and constraints. Furthermore, the autogenerate function is only aware of tables and table-related objects. It does not handle functions, views, and matviews, and these are critical to our databases' operation.
8+
9+
In this situation, autogenerate produces draft migrations that contain a great deal of spurious and undesirable content and omit all content related to non-table objects. Fixing such a draft is more work than starting from an empty script.
10+
11+
Therefore, developers usually create a new migration _without_ using the `--autogenerate` flag. The result is a skeleton migration script with empty `upgrade` and `downgrade` functions. The benefits are that it generates a unique revision identifier, creates an appropriately named file in the correct directory, and inserts the migration into the migration sequence automatically. See [Create a Migration Script](https://alembic.sqlalchemy.org/en/latest/tutorial.html#create-a-migration-script) for an example.
12+
13+
## Generating a new migration script (without autogeneration)
14+
15+
Instructions:
16+
17+
1. Generate the migration script:
18+
19+
```shell script
20+
alembic revision -m "<message>"
21+
```
22+
23+
Notes:
24+
- `PYCDS_SU_ROLE_NAME` is not required for this operation.
25+
- The `<message>` should be a succinct description of what the migration accomplishes; for example "Add name column to Users". This message becomes part of the name and content of the script.
26+
27+
2. Alembic writes a new script to the directory `pycds/alembic/versions`. Its name includes a unique revision identifier (a short hexadecimal string) and a slug derived from `<message>`.
28+
29+
3. Add code to create, modify, and/or drop commands for all items to be managed in this migration. (These commands are on the object `alembic.op`, which is imported in the skeleton script.)
30+
31+
1. **Ensure that all changes respect the specified schema name** that the user will supply when this migration is applied. In particular:
32+
1. It's useful to obtain the specified schema name with `schema=pycds.get_schema_name()`.
33+
2. Ensure the specified schema name is used for all objects, including functions, views and materialized views. This should come without special effort due to the structure of how these items are declared, but it is worth verifying.
34+
35+
2. Do this for both upgrade and downgrade functions in the script!
36+
3. Data and schema migrations are not handled separately and should be included as part of the migration where applicable.
37+
38+
4. Write some tests for the migration, including tests for the data migration where applicable. Examples can be found in the existing code.
39+
40+
5. Commit the new migration script and its tests to the repo.
41+
42+
For more information, see [Create a Migration Script](https://alembic.sqlalchemy.org/en/latest/tutorial.html#create-a-migration-script).
43+
44+
Modify the script to perform the necessary actions for upgrading and downgrading a database to/from this revision. For useful examples, see other scripts in directory `pycds/alembic/versions`.
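
To make step 2 concrete, here is a rough sketch of how the script's file name is derived from the revision identifier and `<message>`. It assumes Alembic's default file template (`<rev>_<slug>`) and default slug truncation length (40 characters); it approximates that behaviour and is not Alembic's own code. The revision identifier `0123456789ab` is made up for illustration.

```python
import re

# Assumption: Alembic defaults (file template "<rev>_<slug>", slug cut at 40
# characters). A project can override both in alembic.ini.
TRUNCATE_SLUG_LENGTH = 40


def slugify(message: str, max_length: int = TRUNCATE_SLUG_LENGTH) -> str:
    """Lower-case the words of the message and join them with underscores."""
    slug = "_".join(re.findall(r"\w+", message)).lower()
    if len(slug) > max_length:
        # Cut at the limit, drop the partial last word, mark the cut with "_".
        slug = slug[:max_length].rsplit("_", 1)[0] + "_"
    return slug


def migration_filename(revision: str, message: str) -> str:
    """Combine a revision identifier and a message into a script file name."""
    return f"{revision}_{slugify(message)}.py"


# "0123456789ab" is a hypothetical revision identifier.
print(migration_filename("0123456789ab", "Add name column to Users"))
```

Long messages are truncated at a word boundary, which is why some existing script names in `pycds/alembic/versions` end in a trailing underscore.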
45+
46+
## Autogenerating a new migration script
47+
48+
As noted above, **_autogeneration is not usually helpful_**. However, here are instructions in case you wish to use it. For more details, see the [Alembic documentation](https://alembic.sqlalchemy.org/en/latest/autogenerate.html).
49+
50+
To autogenerate a migration script, you must have a reference database schema for Alembic to compare to the modified PyCDS ORM. After Alembic autogenerates the script, you must edit it to ensure completeness, correctness, and that it respects the specified schema name (see instructions below).
51+
52+
Instructions:
53+
54+
1. Choose or create a reference database schema that is at the latest migration. (Or a database schema at an otherwise desired "base" migration, which is unusual but not necessarily wrong.)
55+
To serve as a reference, a schema need not contain any data.
56+
57+
1. If the reference database is not named in `alembic.ini`, create an entry there for it of the form
58+
59+
```ini
60+
[<db-label>]
61+
sqlalchemy.url = postgresql://<user-name>@<server-name>/<db-name>
62+
```
63+
64+
1. Autogenerate the migration script:
65+
66+
```shell script
67+
[PYCDS_SCHEMA_NAME=<schema name>] alembic -x db=<db-label> revision --autogenerate -m "<message>"
68+
```
69+
70+
Notes:
71+
- `PYCDS_SU_ROLE_NAME` is not required for this operation.
72+
- The `<message>` should be a succinct description of what the migration accomplishes; for example "Add name column to Users". This message becomes part of the name and content of the script.
73+
74+
1. Alembic writes a new script to the directory `pycds/alembic/versions`. Its name includes a unique revision identifier (a short hexadecimal string) and a slug derived from `<message>`.
75+
76+
1. Review and edit the script to ensure correctness, completeness, and that it respects the specified schema name. Specifically:
77+
78+
1. There are (still) significant differences between the CRMP schema and this ORM. Expect to remove or alter many of the autogenerated operations.
79+
80+
2. Review to ensure that it picks up all changes to tables and implements them appropriately.
81+
In particular, Alembic does not pick up on changes of table or column name, which must be manually converted from "drop old name, add new name" to "rename". Some other schema changes are also not detected.
82+
83+
3. For more information, see [What does Autogenerate Detect (and what does it not detect?)](https://alembic.sqlalchemy.org/en/latest/autogenerate.html#what-does-autogenerate-detect-and-what-does-it-not-detect).
84+
85+
4. Manually add creation or dropping of the following things not handled by Alembic autogenerate:
86+
1. Functions
87+
1. Views
88+
1. Materialized views
89+
90+
5. **Ensure that all changes respect the specified schema name** that the user will supply when this migration is applied. In particular:
91+
1. Replace any autogenerated schema name usage (e.g., `schema='crmp'`) with the specified schema name (`schema=pycds.get_schema_name()`).
92+
1. Ensure the specified schema name is used for functions, views and materialized views. This should come without special effort due to the structure of how these items are declared, but it is worth verifying.
93+
94+
6. Do this for both upgrade and downgrade functions in the script!
95+
96+
1. Write some tests for the migration. Examples can be found in the existing code.
97+
98+
1. Commit the new migration script and its tests to the repo.
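
The schema-name step above can be illustrated with a small sketch. It does not call PyCDS: `get_schema_name_sketch` is a hypothetical stand-in for `pycds.get_schema_name()`, and it assumes the name is read from the `PYCDS_SCHEMA_NAME` environment variable with `crmp` as the fallback. The table name `example_table` is also made up.

```python
import os


def get_schema_name_sketch() -> str:
    """Hypothetical stand-in for pycds.get_schema_name().

    Assumption: the schema name is taken from PYCDS_SCHEMA_NAME and falls
    back to "crmp". Check the real function before relying on this.
    """
    return os.environ.get("PYCDS_SCHEMA_NAME", "crmp")


def autogenerated_target(table: str) -> str:
    """What an unedited autogenerated script does: the schema is baked in."""
    return f"crmp.{table}"


def schema_aware_target(table: str) -> str:
    """What the edited script should do: ask for the schema name at run time."""
    return f"{get_schema_name_sketch()}.{table}"


# Simulate a user applying the migration to a schema other than "crmp".
os.environ["PYCDS_SCHEMA_NAME"] = "other"
print(autogenerated_target("example_table"))
print(schema_aware_target("example_table"))
```

With the hard-coded name, the migration keeps targeting `crmp` no matter which schema the user specified; the schema-aware form follows the user's choice.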
99+
@@ -0,0 +1,35 @@
1+
# IMPORTANT ADVICE AND GUIDELINES
2+
3+
As noted elsewhere, creating a migration is not in principle very complicated. But in this project, where we manage many non-table objects, there are complications. We try to document those complications here.
4+
5+
## Order of activity
6+
7+
1. Each version of a replaceable object is retained in a module named after the revision identifier (see notes below). So the first step is usually to create a new skeleton migration from which the revision identifier can be obtained.
8+
9+
2. Modify the ORM to reflect the changes to be implemented. This can include:
10+
11+
- Modifying a table (e.g., column definitions)
12+
- Creating a new version of a replaceable object to be created or updated (e.g., a function, view or matview)
13+
14+
3. Complete the migration script using the newly defined objects. See the next sections for some important advice on doing that.
15+
16+
## Multiple versions of the same replaceable object
17+
18+
For each migration script to work in perpetuity without error, **_all_** versions of replaceable objects must be retained in the codebase. A given migration will use only one or perhaps two distinct versions of a replaceable object, but there may be many such versions corresponding to many revisions (and attendant migrations).
19+
20+
For example, a view or materialized view may acquire new columns and have its defining query updated accordingly. But because earlier migrations involving that object depend on its earlier versions, those versions must also be retained.
21+
22+
### Locations of versions of replaceable objects
23+
24+
All objects are defined in the package `pycds.orm`, which is subdivided by object type: tables (`pycds.orm.tables`), views (`pycds.orm.views`), etc. Tables are mutated in place, so old versions of them need not be retained. Replaceable objects, by contrast, require every version to be retained separately. Within each replaceable object type module (e.g., `pycds.orm.views`), there is one version module per revision that defines or redefines one or more objects of that type, named after that revision (e.g., `pycds.orm.views.version_bb2a222a1d4a`).
25+
26+
(This organization is a little awkward in practice, but logical. Other and possibly better ways of organizing things are conceivable.)
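
As a small illustration of this naming scheme, the sketch below derives a version module name from a migration script's file name. The helper names are hypothetical, and the 12-character hexadecimal prefix is an assumption based on the identifiers used in this project.

```python
import re


def revision_from_filename(filename: str) -> str:
    """Extract the revision identifier that prefixes a migration script name.

    Assumption: identifiers are 12 lower-case hexadecimal characters.
    """
    match = re.match(r"([0-9a-f]{12})_", filename)
    if match is None:
        raise ValueError(f"no revision identifier found in {filename!r}")
    return match.group(1)


def version_module_name(object_type: str, revision: str) -> str:
    """Build the dotted name of the version module for an object type."""
    return f"pycds.orm.{object_type}.version_{revision}"


script = "22819129a609_convert_collapsed_vars_mv_to_native_.py"
revision = revision_from_filename(script)
print(version_module_name("views", revision))
```

This is only a way to see the convention at a glance; in a migration script the version module is imported explicitly by name.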
27+
28+
## Obtaining the correct version of replaceable objects in migrations
29+
30+
Recall that a replaceable object (a.k.a. non-table object) is one that cannot be modified in-place in the way that a table can. Instead (prompting the name), such an object must be replaced in its entirety.
31+
32+
**In a migration script that updates a replaceable object, it is _essential_ to obtain the old and new replaceable objects directly from the version directories where they are defined.** This ensures that the correct version is dropped and/or created.
33+
34+
When dependent objects must be dropped and recreated in the course of updating a primary object, ensure you get each dependent object from the correct version directory, and not from the top-level `pycds` module. If you do not do this, migrations will fail or potentially update the database with the wrong objects.
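
The following sketch simulates why the import location matters. It uses no PyCDS or Alembic code: `ViewDef`, `VERSIONS`, `head` and the column names are all invented for illustration, and a plain dictionary stands in for the database.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ViewDef:
    """Hypothetical stand-in for one version of a replaceable object."""

    name: str
    columns: tuple


# Stand-ins for version modules, keyed by revision, oldest first.
# The column names are invented for illustration.
VERSIONS = {
    "84b7fc2596d5": ViewDef("crmp_network_geoserver", ("col_a", "col_b")),
    "6cb393f711c3": ViewDef("crmp_network_geoserver", ("col_a", "col_b", "tags")),
}


def head() -> ViewDef:
    """What a top-level import resolves to: always the newest definition."""
    return list(VERSIONS.values())[-1]


def recreate_dependent_view(database: dict, view: ViewDef) -> None:
    """An earlier migration recreating a dependent view in the 'database'."""
    database[view.name] = view


# Pinned: the earlier migration recreates the version it was written against.
pinned_db = {}
recreate_dependent_view(pinned_db, VERSIONS["84b7fc2596d5"])

# Unpinned: the same migration silently picks up the newest definition.
unpinned_db = {}
recreate_dependent_view(unpinned_db, head())

print(pinned_db["crmp_network_geoserver"].columns)
print(unpinned_db["crmp_network_geoserver"].columns)
```

In the unpinned case, a database at an early revision ends up holding a definition that belongs to a later revision, which is the "wrong objects" outcome described above.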
35+
+32
@@ -0,0 +1,32 @@
1+
# Introduction
2+
3+
## Essential references
4+
5+
The Alembic documentation is our fundamental reference. Note: the documentation appears to be available only for the latest version of Alembic (1.13 at time of writing), which is significantly ahead of the version used in this project (1.6 at time of writing). Read it with this in mind: where something does not work as documented, the version gap may be the reason. Examining the Alembic source code for the installed version can help clarify such issues.
6+
7+
Here is a list of key pages in the Alembic documentation:
8+
9+
- [Alembic documentation](https://alembic.sqlalchemy.org/en/latest/index.html)
10+
- [Alembic tutorial](https://alembic.sqlalchemy.org/en/latest/tutorial.html)
11+
- [Alembic cookbook](https://alembic.sqlalchemy.org/en/latest/cookbook.html)
12+
- [Operation reference](https://alembic.sqlalchemy.org/en/latest/ops.html)
13+
14+
## Existing Alembic infrastructure in project
15+
16+
When you have installed this project locally:
17+
18+
- Alembic is installed. You can use Alembic commands on the command line.
19+
- The Alembic [migration environment](https://alembic.sqlalchemy.org/en/latest/tutorial.html#the-migration-environment) has been created and customized.
20+
- Alembic-specific content is in the directory `pycds/alembic`, following standard Alembic practice.
21+
- There are a number of already-written migration scripts in `pycds/alembic/versions`.
22+
- Extensions to support non-table objects (e.g., functions, views, matviews) have already been written.
23+
- Extensions to Alembic proper are in `pycds/alembic/extensions`.
24+
- Extensions to SQLAlchemy are in `pycds/sqlalchemy`. These support the Alembic extensions, although they can be and are used elsewhere too.
25+
26+
## Development activities
27+
28+
In the abstract, creating a migration is a relatively straightforward activity. This project, however, is fairly complicated, and writing migrations is correspondingly more complicated.
29+
30+
In particular, managing non-table objects (functions, views, matviews, etc.) is not supported by Alembic out of the box. We have written extensions to support migrations involving them. Additionally, such objects are not mutable, and so must be dropped and replaced in their entirety -- unlike tables, in which elements (e.g., columns) can be modified in-place.
31+
32+
Such non-table objects are called here _replaceable objects_.
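
A minimal sketch of the drop-and-replace pattern follows. `RecordingOp` is a hypothetical stand-in for `alembic.op` that records calls instead of executing SQL; the method names match the operations used in this project's migration scripts, but the class itself and the string placeholders for the objects are illustration only.

```python
class RecordingOp:
    """Hypothetical stand-in for alembic.op: records calls, runs no SQL."""

    def __init__(self):
        self.calls = []

    def drop_replaceable_object(self, obj, schema=None):
        self.calls.append(("drop", obj, schema))

    def create_replaceable_object(self, obj, schema=None):
        self.calls.append(("create", obj, schema))


def upgrade(op, old, new, schema):
    """Replace the old version of the object with the new one."""
    op.drop_replaceable_object(old, schema=schema)
    op.create_replaceable_object(new, schema=schema)


def downgrade(op, old, new, schema):
    """Mirror image of upgrade: put the old version back."""
    op.drop_replaceable_object(new, schema=schema)
    op.create_replaceable_object(old, schema=schema)


# Strings stand in for the old and new object definitions.
up = RecordingOp()
upgrade(up, "old_view", "new_view", schema="example_schema")
down = RecordingOp()
downgrade(down, "old_view", "new_view", schema="example_schema")
print(up.calls)
print(down.calls)
```

Because nothing is altered in place, the downgrade is simply the upgrade with the old and new versions swapped.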

pycds/alembic/versions/22819129a609_convert_collapsed_vars_mv_to_native_.py

+1-1
@@ -15,7 +15,7 @@
 from pycds.orm.views.version_22819129a609 import (
     CollapsedVariables as CollapsedVariablesView,
 )
-from pycds import CrmpNetworkGeoserver
+from pycds.orm.views.version_84b7fc2596d5 import CrmpNetworkGeoserver


 # revision identifiers, used by Alembic.
@@ -0,0 +1,45 @@
+"""update CrmpNetworkGeoserver
+
+Revision ID: 6cb393f711c3
+Revises: fecff1a73d7e
+Create Date: 2024-01-05 13:06:02.811787
+
+"""
+import logging
+from alembic import op
+from pycds import get_schema_name
+from pycds.orm.views.version_84b7fc2596d5 import (
+    CrmpNetworkGeoserver as OldCrmpNetworkGeoserver,
+)
+from pycds.orm.views.version_6cb393f711c3 import (
+    CrmpNetworkGeoserver as NewCrmpNetworkGeoserver,
+)
+
+
+# revision identifiers, used by Alembic.
+revision = "6cb393f711c3"
+down_revision = "fecff1a73d7e"
+branch_labels = None
+depends_on = None
+
+
+logger = logging.getLogger("alembic")
+schema_name = get_schema_name()
+
+
+def drop_view(view):
+    op.drop_replaceable_object(view, schema=schema_name)
+
+
+def create_view(view):
+    op.create_replaceable_object(view, schema=schema_name)
+
+
+def upgrade():
+    drop_view(OldCrmpNetworkGeoserver)
+    create_view(NewCrmpNetworkGeoserver)
+
+
+def downgrade():
+    drop_view(NewCrmpNetworkGeoserver)
+    create_view(OldCrmpNetworkGeoserver)

pycds/alembic/versions/bf366199f463_convert_station_obs_stats_mv_to_native_.py

+5-4
@@ -8,15 +8,18 @@
 import logging
 from alembic import op
 import sqlalchemy as sa
-from sqlalchemy.dialects import postgresql
 from pycds import get_schema_name
 from pycds.orm.native_matviews.version_bf366199f463 import (
     StationObservationStats as StationObservationStatsMatview,
 )
 from pycds.orm.views.version_bf366199f463 import (
     StationObservationStats as StationObservationStatsView,
 )
-from pycds import CrmpNetworkGeoserver
+
+# Important: We must obtain replaceable database objects from the appropriate revision
+# of the database. Otherwise, later migrations will cause errors in earlier migrations
+# due to a mismatch between the expected version and the latest (head) version.
+from pycds.orm.views.version_84b7fc2596d5 import CrmpNetworkGeoserver

 # revision identifiers, used by Alembic.
 revision = "bf366199f463"
@@ -30,12 +33,10 @@


 def drop_dependent_objects():
-    """What it says on the box"""
     op.drop_replaceable_object(CrmpNetworkGeoserver)


 def create_dependent_objects():
-    """What it says on the box"""
     op.create_replaceable_object(CrmpNetworkGeoserver)


pycds/database.py

+1-1
@@ -5,7 +5,7 @@
55

66

 def check_migration_version(
-    executor, schema_name=get_schema_name(), version="fecff1a73d7e"
+    executor, schema_name=get_schema_name(), version="6cb393f711c3"
 ):
     """Check that the migration version of the database schema is compatible
     with the current version of this package.
