Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

instrumentation-http uses the wrong method to parse URLs for outgoing requests #5060

Closed
gcampax opened this issue Oct 9, 2024 · 5 comments · Fixed by #5091
Closed

instrumentation-http uses the wrong method to parse URLs for outgoing requests #5060

gcampax opened this issue Oct 9, 2024 · 5 comments · Fixed by #5091
Assignees
Labels
bug Something isn't working has:reproducer This bug/feature has a minimal reproducer provided pkg:instrumentation-http priority:p1 Bugs which cause problems in end-user applications such as crashes, data inconsistencies, etc

Comments

@gcampax
Copy link

gcampax commented Oct 9, 2024

What happened?

instrumentation-http intercepts all requests made with the http or https node modules, but if the first argument is a string, it uses url.parse() to convert the string url into the RequestOptions object, instead of new URL() as documented by nodejs (in recent versions).
This causes a discrepancy between instrumented and non-instrumented requests, in particular with regards to requests that have unescaped non-ASCII Unicode characters in the path or query

Steps to Reproduce

Have a script that makes a request to a URL containing a non-ASCII Unicode character (not percent encoded).

Expected Result

The request succeeds, as if the character was % encoded.
(This is what happens if the request is not instrumented)

Actual Result

TypeError: Request path contains unescaped characters

OpenTelemetry Setup Code

I'm not using OpenTelemetry directly. OpenTelemetry is setup by the Sentry node js SDK.

package.json

@opentelemetry/instrumentation-http: 0.53.0

Relevant log output

No response

@gcampax gcampax added bug Something isn't working triage labels Oct 9, 2024
@olabodeIdowu
Copy link

olabodeIdowu commented Oct 15, 2024

@gcampax, here’s a sample script that demonstrates how to use OpenTelemetry to instrument an HTTP request to a URL with non-ASCII characters:

// javascript
const { HttpInstrumentation } = require("@opentelemetry/instrumentation-http");
const { NodeSDK } = require("@opentelemetry/sdk-node");
const { ConsoleSpanExporter } = require("@opentelemetry/sdk-trace-node");

// Initialize OpenTelemetry
const sdk = new NodeSDK({
    traceExporter: new ConsoleSpanExporter(),
    instrumentations: [HttpInstrumentation],
});

// Start the SDK
sdk.start().then(() => {
    // Define a URL with non-ASCII characters
    const url = 'http://example.com/привет'; // Replace with your desired URL

    // Make an HTTP request
    fetch(url)
        .then(response => response.text())
        .then(body => {
            console.log('Response received:', body);
        })
        .catch(err => {
            console.error('Error occurred:', err);
        })
        .finally(() => {
            // Shutdown the SDK
            sdk.shutdown().then(() => {
                console.log('SDK shut down');
            });
        });
});

Explanation:

  1. Dependencies: The script imports necessary modules from OpenTelemetry and node-fetch for making HTTP requests.
  2. SDK Initialization: The OpenTelemetry SDK is initialized with an HTTP instrumentation that allows tracing of HTTP requests.
  3. Non-ASCII URL: A URL is defined that includes non-ASCII characters (in this case, "привет").
  4. HTTP Request: The script makes a fetch request to the specified URL and logs the response or any errors.
  5. Shutdown: Finally, it shuts down the OpenTelemetry SDK after the request completes.

Step 3: Run the Script

You can run this script using Node.js:

bash
node your-script-name.js

Make sure to replace your-script-name.js with the actual name of your script file. This will execute the HTTP request to the URL with non-ASCII characters and log the response.

@gcampax
Copy link
Author

gcampax commented Oct 16, 2024

Hi thanks for creating this reproducer, did you run it? Do you see the issue?

@olabodeIdowu
Copy link

No. try and run the code snippet in your editor and let me know your reply or rather share your code where you have this issues so I can pinpoint what you might be doing wrong.

@gcampax
Copy link
Author

gcampax commented Oct 16, 2024

Your script is buggy, but this one works:

// javascript
const { HttpInstrumentation } = require("@opentelemetry/instrumentation-http");
const { NodeSDK } = require("@opentelemetry/sdk-node");
const { ConsoleSpanExporter } = require("@opentelemetry/sdk-trace-node");

// Initialize OpenTelemetry
const sdk = new NodeSDK({
    traceExporter: new ConsoleSpanExporter(),
    instrumentations: [new HttpInstrumentation],
});

// Start the SDK
sdk.start();

// Define a URL with non-ASCII characters
const url = 'https://example.com/привет'; // Replace with your desired URL

// Make an HTTP request
const https = require('https');
https.get(url)

Run it like this and you get

/home/gcamp/Projects/opentelemetry-repro/node_modules/@opentelemetry/instrumentation-http/build/src/http.js:414
                        throw error;
                        ^

TypeError [ERR_UNESCAPED_CHARACTERS]: Request path contains unescaped characters
    at new ClientRequest (node:_http_client:184:13)
    at request (node:https:381:10)
    at /home/gcamp/Projects/opentelemetry-repro/node_modules/@opentelemetry/instrumentation-http/build/src/http.js:410:94
    at safeExecuteInTheMiddle (/home/gcamp/Projects/opentelemetry-repro/node_modules/@opentelemetry/instrumentation/build/src/utils.js:28:18)
    at /home/gcamp/Projects/opentelemetry-repro/node_modules/@opentelemetry/instrumentation-http/build/src/http.js:410:78
    at AsyncLocalStorage.run (node:internal/async_local_storage/async_hooks:91:14)
    at AsyncLocalStorageContextManager.with (/home/gcamp/Projects/opentelemetry-repro/node_modules/@opentelemetry/context-async-hooks/build/src/AsyncLocalStorageContextManager.js:33:40)
    at ContextAPI.with (/home/gcamp/Projects/opentelemetry-repro/node_modules/@opentelemetry/api/build/src/api/context.js:60:46)
    at outgoingRequest (/home/gcamp/Projects/opentelemetry-repro/node_modules/@opentelemetry/instrumentation-http/build/src/http.js:401:38)
    at httpsOutgoingRequest (/home/gcamp/Projects/opentelemetry-repro/node_modules/@opentelemetry/instrumentation-http/build/src/http.js:153:93) {
  code: 'ERR_UNESCAPED_CHARACTERS'
}

Node.js v22.9.0

Comment out all the OpenTelemetry init code / module loading and the request succeeds.

As I mentioned in the original comment, there is a clear bug in the OpenTelemetry code, where the wrong URL parsing function is used (a bug that could have security implications, given the documentation for url.parse()). Unless you disagree with that, I think this should be enough details to triage and resolve the issue.

@pichlermarc pichlermarc added priority:p1 Bugs which cause problems in end-user applications such as crashes, data inconsistencies, etc pkg:instrumentation-http has:reproducer This bug/feature has a minimal reproducer provided and removed triage labels Oct 16, 2024
@pichlermarc
Copy link
Member

Thanks for the detailed report and the reproducer @gcampax - I'll try to find someone to work on this ASAP.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working has:reproducer This bug/feature has a minimal reproducer provided pkg:instrumentation-http priority:p1 Bugs which cause problems in end-user applications such as crashes, data inconsistencies, etc
Projects
None yet
4 participants