-
Notifications
You must be signed in to change notification settings - Fork 660
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Update WSGI & Flask integrations to follow new semantic conventions #299
Conversation
Codecov Report
@@ Coverage Diff @@
## master #299 +/- ##
==========================================
- Coverage 85.66% 84.28% -1.39%
==========================================
Files 33 33
Lines 1612 1648 +36
Branches 180 195 +15
==========================================
+ Hits 1381 1389 +8
- Misses 185 204 +19
- Partials 46 55 +9
Continue to review full report at Codecov.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Couple of comments, nothing blocking.
# pylint:disable=too-many-branches,too-many-return-statements | ||
if code < 100: | ||
return StatusCanonicalCode.UNKNOWN | ||
if code <= 299: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure if it would be worth adding a check for 200 at the top since i suspect that's going to be by large the status code we hit. Possibly a premature optimization though
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wouldn't want to guess. 204 No Content is probably also a popular response code to POST requests and the like. I think before that I'd rather do things like the classical _StatusCanonicalCode=StatusCanonicalCode
default parameter to avoid accessing globals.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It feels to me that, rather than this range, just creating a dict with hard-coded values would be more performant. Might be worth a benchmark but would make behavior consistent regardless of what codes are common.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You mean a dict with 500 entries?
"http.server_name": environ["SERVER_NAME"], | ||
"http.scheme": environ["wsgi.url_scheme"], | ||
"host.port": int(environ["SERVER_PORT"]), | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there ever a case where any of these environment variables wouldn't be present? If so using environ.get
may be better, unless raising an exception is preferable
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
According to PEP3333, these are required: https://www.python.org/dev/peps/pep-3333/#environ-variables. I guess there might be nonconforming implementations out there. I have no strong opinion here. Maybe it would be better to come up with a convention of using try:
/except:
to make sure the integrations don't break applications.
if flavor.upper().startswith(_HTTP_VERSION_PREFIX): | ||
flavor = flavor[len(_HTTP_VERSION_PREFIX) :] | ||
if flavor: | ||
result["http.flavor"] = flavor |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could use setifnotnone
here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unfortunately not, since I use ""
as second argument to get
above, in order to unconditionally use upper
and startswith
. I think there is really no way to save a conditional here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah right!
Conflicts: ext/opentelemetry-ext-flask/src/opentelemetry/ext/flask/__init__.py ext/opentelemetry-ext-flask/tests/test_flask_integration.py ext/opentelemetry-ext-wsgi/src/opentelemetry/ext/wsgi/__init__.py ext/opentelemetry-ext-wsgi/tests/test_wsgi_middleware.py
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, and I think it couldn't hurt to protect against non-PEP3333-compliant envs as in https://github.com/open-telemetry/opentelemetry-python/pull/299/files#r350626307.
@Oberon00 let me know when you think this is ready to merge |
Unless you think I should address the non-PEP3333-compliant environ safety issues first, I think this is ready (but it might cause merge conflicts for the named tracers PR #301). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Approved, looks like you had enough, but I guess another one wouldn't hurt :)
"http.status_code": 404, | ||
"http.status_text": "NOT FOUND", | ||
}, | ||
{"http.status_code": 404, "http.status_text": "NOT FOUND"}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
random and more general question: is it needed to have "status_text"? feels to me like it adds a lot to the payload and is often identical.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, the status text is optional. But note that the integration does not insert static text here but actual "output" from the WSGI application. EDIT: See spec: https://github.com/open-telemetry/opentelemetry-specification/blob/master/specification/data-http.md#common-attributes
# pylint:disable=too-many-branches,too-many-return-statements | ||
if code < 100: | ||
return StatusCanonicalCode.UNKNOWN | ||
if code <= 299: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It feels to me that, rather than this range, just creating a dict with hard-coded values would be more performant. Might be worth a benchmark but would make behavior consistent regardless of what codes are common.
elif not url.startswith(environ["wsgi.url_scheme"] + ":"): | ||
# Something fishy is in RAW_URL. Let's fall back to request_uri() | ||
url = wsgiref_util.request_uri(environ) | ||
def setifnotnone(dic, key, value): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should follow the underscore naming convention, even for this small method name.
else: | ||
url = wsgiref_util.request_uri(environ) | ||
result["http.url"] = wsgiref_util.request_uri(environ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it feels to me like http.url could be decomposed into http.target and http.host or something along those lines: otherwise if I wanted the target, I would have to know and check for url in the case that target did not exist, and perform a split to extract that myself.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, we could decompose, but I see no benefit in it. Who do you mean with "myself" above? I think the back-end is a better place for such logic, the integrations should send information to the back-end as "raw" as they want as more raw usually means more info.
Updates flask & WSGI integration to follow new semantic conventions for HTTP as of open-telemetry/opentelemetry-specification#263 (see https://github.com/open-telemetry/opentelemetry-specification/blob/master/specification/data-http.md).
The HTTP client (requests) integration will need an update too, and for that the wsgi
opentelemetry.ext.wsgi.http_status_to_canonical_code
will need to be used by the requests-instrumentation -- either by just adding a dependency (since ext-wsgi is and will stay dependency-free itself, this would technically not be a problem), or by splitting this function into a new package. This will be done in a follow-up PR.