Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Stream Django SQL queries and add flag to toggle their streaming #111

Merged
merged 7 commits into from
Jan 8, 2019

Conversation

hasier
Copy link
Contributor

@hasier hasier commented Nov 1, 2018

Currently, only SQLAlchemy reports a sanitized SQL query in the subsegment metadata. Also, it is always attached, with no option to turn it off.

These changes introduce a flag (enabled by default to keep the current behaviour) to allow toggling that feature. There is also new functionality to attach the SQL queries coming from the Django ORM.

A new variable is also read from the Django settings for convenience, which functions the previously mentioned toggle.

@hasier hasier closed this Nov 1, 2018
@hasier
Copy link
Contributor Author

hasier commented Nov 1, 2018

Sorry didn't intend to open this PR here, just testing around in own fork

@hasier
Copy link
Contributor Author

hasier commented Nov 8, 2018

Updated the branch, should be ready for review.

@hasier hasier reopened this Nov 8, 2018
@hasier hasier changed the title Add params and functions to allow recording and streaming SQL queries Stream dbapi2 SQL queries and add flag to toggle their streaming Nov 8, 2018
@@ -14,6 +14,7 @@
'DYNAMIC_NAMING': None,
'STREAMING_THRESHOLD': None,
'MAX_TRACE_BACK': None,
'STREAM_SQL': False,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the late response. If by default the sql should captured, should this be set to True?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No worries, thanks for your comment. I meant it is enabled by default as this is the current behaviour for SQLAlchemy, so I wanted to keep it. But for the rest (in this case Django), I have set it off by default, as it is the behaviour previous to this PR. Hope this makes sense.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi, I would not consider an enhancement to be concerned. We have some other enhancements that added more data and haven't seen any issue.

The SQLAlchemy query capture was also submitted as a PR. In fact the SDK means to have query capture as the default behavior. It's just due to security concerns it has different development and review cycle. I would suggest to keep the query capture behavior consistent, which that behavior is to have parameterized query ready whenever possible. Thoughts?

Copy link
Contributor Author

@hasier hasier Dec 27, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Regarding the default behaviour, I agree, I'll change it so that capture is enabled by default.

Regarding the parametrised queries, afaik there is no way to ensure that the incoming query is a parametrised one... DBAPI2 (PEP 249) recommends this behaviour, but then it is up to the implementation to decide how the queries and parameters are formatted.
Is this the concern you have? Would it be better to attempt the patch in a different place? The only one I can think of is to attempt to patch the Django ORM (I haven't tried, so I don't know how feasible it is). I mention the Django ORM as, for example, psycopg2 also allows different Cursor classes to be used with its connections, which might lead to the same situation again.
But it is also true that it is this XRay SDK the one that controls which drivers are patched, as the patch needs to be included in a module here. As far as I can see, those that implement DBAPI2 and are included here for now are just Django ORM, SQLAlchemy and SQLite, so I guess it should not be that big a concern?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. That's another thing I'm going to mention besides the default config value. For SQLAlchemy the patcher actually works on the implementation level and there are tests against parameterized queries. https://github.com/aws/aws-xray-sdk-python/blob/master/tests/ext/sqlalchemy/test_query.py.

DBAPI level query capture probably will not always work but if it does work for the current patchers (Django or SQlite) I'm OK to move forward with an internal security review. If not I would suggest to have the toggle function just for SQLAlchemy and everything else in the separate PRs for each actual patcher.

You can see the PR for SQLAlchemy query capture PR here: #34

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I removed the patch for DBAPI2 and just did so for Django. Please let me know if it makes sense now.

@hasier hasier changed the title Stream dbapi2 SQL queries and add flag to toggle their streaming Stream Django SQL queries and add flag to toggle their streaming Jan 7, 2019
@haotianw465 haotianw465 merged commit ff716e8 into aws:master Jan 8, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants