Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make PyCDS schema name agnostic #44

Merged
merged 15 commits into from
Jan 22, 2020
Merged

Conversation

rod-glover
Copy link
Contributor

@rod-glover rod-glover commented Jan 16, 2020

Resolves #32

This PR replaces the hard-coded schema name 'crmp' throughout PyCDS with a schema name determined by the value of the environment variable PYCDS_SCHEMA_NAME. If PYCDS_SCHEMA_NAME is not defined, the schema name defaults to 'crmp', making this change fully backward compatible with existing code.

To do:

  • Invoke tests with PYCDS_SCHEMA_NAME` defined
  • Update README

@rod-glover rod-glover force-pushed the i32-2-schema-name-agnostic branch from 0fa5f80 to 8c3f0eb Compare January 17, 2020 20:06
Copy link
Contributor

@jameshiebert jameshiebert left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks straightforward to me and it seems to be a reasonable solution to set the schema at run time with an env var.

One comment: should we advise the user of the specific outcome of setting the schema? Your comment just says to "specify PYCDS_SCHEMA_NAME correctly" without advising what "correct" is. Is it sufficient (and correct) to say that PYCDS_SCHEMA_NAME must agree with the schema location of an existing database? And that for creating a new database PYCDS_SCHEMA_NAME may be set to anything?

One question: You've left logic in (in prefix()) to allow for unspecified schemas, however prefix() is never called with default args and crmp will always be used by default rather than going unspecified. That's not a problem, but was that intentional? Do you see a use case for unspecified schemas or should we continue to have everything fully qualified?

@@ -68,11 +74,12 @@
LANGUAGE plpgsql VOLATILE SECURITY DEFINER
COST 100
ROWS 1000;
''')
'''.format(prefix(schema)))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What? You're not going all-in on f-strings yet?! :p

@rod-glover
Copy link
Contributor Author

I've addressed both comments (and f-strings!) in the code. Good suggestions.

RETURNS text AS
$BODY$
stn_query = "SELECT * FROM crmp.getStationVariableTable(" + str(station_id) + ", false)"
stn_query = "SELECT * FROM getStationVariableTable(" + str(station_id) + ", false)"
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

{schema}.getStationVariableTable

@@ -29,6 +29,7 @@
"""

from sqlalchemy.schema import DDL
from pycds import get_schema_name
Copy link
Contributor Author

@rod-glover rod-glover Jan 17, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a set of bugs, or at least shortcomings, introduced in these function definitions. The assumption we are now (mostly) making is that the search path is not to be relied on, and schema name is explicit everywhere. Except that is not the case here. These definitions were copied direct from the CRMP database definitions and they didn't include explicit schema names.

A couple of examples noted in comments below. The problem occurs wherever a database entity is referred to in a function - another function , a table, view, etc.

@rod-glover
Copy link
Contributor Author

@jameshiebert , regarding the lack of explicit schema name in function bodies pointed out above: I think I should interpolate schema names throughout. It's not hard to do, although without tests for every function, it would be easy to make an undetected mistake. Your thoughts?

@jameshiebert
Copy link
Contributor

I should interpolate schema names throughout.

Sounds good to me.

It's not hard to do, although without tests for every function, it would be easy to make an undetected mistake. Your thoughts?

Yes that's true, though with a careful read, I'd think you should be able to find them all.

@rod-glover
Copy link
Contributor Author

@jameshiebert , I had a little feeling that not all was well in all functions. The smoke tests proved out that intuition, and I corrected most of the problems. Definitely worth the effort.

Two problems remain: daily_ts and monthly_ts have errors in their definitions in the CRMP database and those errors were transcribed to this package. I will put in an issue to fix them, but that will require intervention by Faron, I think. I'll @ mention him in the issue. Meanwhile the failing tests are marked xfail.

Any further comments?

@jameshiebert
Copy link
Contributor

No further comments from me.

@rod-glover rod-glover merged commit a8c2193 into master Jan 22, 2020
@rod-glover rod-glover deleted the i32-2-schema-name-agnostic branch January 22, 2020 01:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Make PyCDS schema (name) agnostic
2 participants