Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add elasticsearch db.statement sanitization #1598

Merged

Conversation

nemoshlag
Copy link
Member

@nemoshlag nemoshlag commented Jan 24, 2023

Description

Added an optional query sanitizer to the elasticsearch instrumentation.
Usage
ElasticsearchInstrumentor().instrument(sanitize_query=True)

This will affect the DB_STATEMENT value to contain the original query or sanitized one.

Fixes #1545
Following the specification discussion here

Type of change

Please delete options that are not relevant.

  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update

How Has This Been Tested?

  • Test A db.statement has been sanitized
  • Test B no side affects occurred on a query without a db.statement attribute

Does This PR Require a Core Repo Change?

  • No.

Checklist:

  • Followed the style guidelines of this project
  • Changelogs have been updated
  • Unit tests have been added
  • Documentation has been updated

@nemoshlag nemoshlag requested a review from a team January 24, 2023 10:28
Copy link
Contributor

@avzis avzis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe add a description of the param 'sanitize_query' in the init file (line 54)

@shalevr
Copy link
Member

shalevr commented Jan 24, 2023

Can we use the same sanitization function for all of the db libraries?

@nemoshlag
Copy link
Member Author

I think that this could be an option for all SQL-based DBs, but if we focus on "specific sanitization" for all DB libraries this has some potential to over complicated this issue

CHANGELOG.md Outdated
- `opentelemetry-instrumentation-redis` Add `sanitize_query` config option to allow query sanitization. ([#1572](https://github.com/open-telemetry/opentelemetry-python-contrib/pull/1572))
- Add default query sanitization for elasticsearch db.statement attribute
([#1545](https://github.com/open-telemetry/opentelemetry-python-contrib/pull/1545))
- `opentelemetry-instrumentation-redis` Add `sanitize_query` config option to allow query sanitization.
Copy link
Member

@shalevr shalevr Feb 5, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The merge with the main accidentally adds this entry

CHANGELOG.md Outdated
@@ -9,7 +9,10 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

### Added

- `opentelemetry-instrumentation-redis` Add `sanitize_query` config option to allow query sanitization. ([#1572](https://github.com/open-telemetry/opentelemetry-python-contrib/pull/1572))
- Add default query sanitization for elasticsearch db.statement attribute
([#1545](https://github.com/open-telemetry/opentelemetry-python-contrib/pull/1545))
Copy link
Member

@shalevr shalevr Feb 5, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Your PR is not #1598 ?

@shalevr
Copy link
Member

shalevr commented Feb 5, 2023

LGTM

@srikanthccv srikanthccv merged commit df32e8c into open-telemetry:main Feb 10, 2023
@nemoshlag nemoshlag deleted the add/elasticsearch-sanitization branch February 13, 2023 12:01
@phillipuniverse
Copy link
Contributor

FYI this does not work with the elasticsearch bulk API, see #1868

@@ -135,11 +138,16 @@ def _instrument(self, **kwargs):
tracer = get_tracer(__name__, __version__, tracer_provider)
request_hook = kwargs.get("request_hook")
response_hook = kwargs.get("response_hook")
sanitize_query = kwargs.get("sanitize_query", False)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't the default be True? See also open-telemetry/opentelemetry-specification#3104

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Implement sensitive data sanitization for elasticsearch instrumentation
6 participants