improving url parsing: make it more strict #751

sax · 2025-02-11T18:20:56Z

I have a number of projects that require hackney via transient dependencies. This morning my security audits started flagging hackney with a low severity error for CVE-2025-1211.

I don't see an existing issue for this, so thought I would bring it to your attention here.

I discovered this through the mix_audit library, which I run as a part of my CI pipelines.

The text was updated successfully, but these errors were encountered:

justincy · 2025-02-11T19:00:22Z

A temporary work around:

mix deps.audit --ignore-advisory-ids GHSA-vq52-99r9-h5pw

oscar-ospina · 2025-02-11T21:28:31Z

Wondering... do you guys have plans for patching this?

benoitc · 2025-02-11T22:13:35Z

wondering how one could open a CVE about "possible" security issue without even contacting the author first. I am tired of this behavior.

Hackney doesn't giive 100% warranty a URL is syntactically correct. This is more a user and conveniencechoice . In the example linked above the host will be "127.0.0.2 " since all before @ will be shown as an auth string . The only "fix" i can see is to add an option to not use that feature and only ensure host and ports are passed there.

Trevoke · 2025-02-11T22:43:19Z

It looks like this is sample code that showcases the vulnerability: https://gist.github.com/snoopysecurity/996de09ec0cfd0ebdcfdda8ff515deb1

kevinkirkup · 2025-02-11T23:09:25Z

That gist has been open for 11 months.
Thanks snyk.io!

Trevoke · 2025-02-11T23:21:55Z

@kevinkirkup It has. I'm afraid I don't understand what value this information adds to the conversation, can you help me see it?

benoitc · 2025-02-11T23:23:39Z

i will send a pr tomorrow to make url parsing stricter. I am debating it should be the default there though. But since mix.audit already flag it, there maybe no choice to make the default strict anyway. any feedback is welcome.

sax · 2025-02-12T00:08:40Z

Yeah, it must be very frustrating that this was known about for that long and a CVE was issued without (it seems) any attempt to communicate with you. Thank you for the effort to resolve it!

With regards to the default implementation... I'm struggling to find the RFC(s) that covers the use of @ in basic auth passwords, especially regarding using them in URIs. This Stack Overflow answer implies that @ characters are only valid in quoted strings, but while it links to specific sections of RFC2616 I don't personally see how the linked sections tie back to URI parsing.

My personal opinion is that stricter would be preferred. If someone wants to use basic or digest auth with an @ in the password, they could either not, or they could use an Authorization request header instead.

warmwaffles · 2025-02-12T00:23:01Z

they could either not

This is the only sane method.

sax · 2025-02-12T00:24:22Z

Ok, from my reading of RFC3986 section 3.2, an @ is not valid in the userinfo part of a URI. It would need to be encoded as %40.

authority     = [ userinfo "@" ] host [ ":" port ]
userinfo    = *( unreserved / pct-encoded / sub-delims / ":" )

unreserved    = ALPHA / DIGIT / "-" / "." / "_" / "~"
sub-delims     =       "$" / "&" / "'" /                                     "=" / other-delims
other-delims   = "!" /                   "(" / ")" / "*" / "+" / "," / ";"
pct-encoded   = "%" HEXDIG HEXDIG

warmwaffles · 2025-02-12T01:17:14Z

This won't mean much here but in Elixir it is parsed out as

%URI{
  scheme: "http",
  authority: "127.0.0.1",
  userinfo: nil,
  host: "127.0.0.1",
  port: 80,
  path: nil,
  query: "@127.2.2.2/",
  fragment: nil
} = URI.parse("http://[email protected]/")

The host is right, the request is supposed to go to 127.0.0.1

Using uri_string seems to be a good move https://www.erlang.org/doc/apps/stdlib/uri_string but I don't know which OTP version it was introduced in 😞

EDIT: OTP 21 is when uri_string was added. Probably safe to use here.

benoitc · 2025-02-12T08:42:57Z

So the reason we parse the user auth from the uri is because hackney is an http client. Just like curl or your browser does it , it's useful for the user to have the same convenience. Since. It's an HTTP client and as such is not only used for the use case of a server to connect to an api. There is nothing I see really wrong in parsing the auth from the http line there per see. Removing this convenience as an http client lib will break a lot of code around.

Also here I fail to see how it's a security issue, if 127.0.0.2 can be used, what stop the "attacker" to just pass this host? That would mean that no check is done on the uri at first? User can already fix this "security" issue by parsing the uri (using the hackney_url module) they receive or store and ensure that user part is stripped. Then remove it and check if the host is accepted. How lazy should be the validation?

That said, this convenience will be kept around but will now be enabled using a global option. The patch add support of url validation (strict mode enabled) . This will land during the day.

jewalky · 2025-02-12T12:24:44Z

So the reason we parse the user auth from the uri is because hackney is an http client. Just like curl or your browser does it , it's useful for the user to have the same convenience.

Worth noting though that both curl and available browsers (Firefox, Chrome) parse the URL http://[email protected]/ as going to 127.0.0.1 — while they still support being able to provide a username and password. So it's not entirely clear what's wrong with both keeping the convenience and fixing the issue, and why does it have to be mutually exclusive to either support @ or address the CVE 🤔

User can already fix this "security" issue by parsing the uri (using the hackney_url module)

Considering Hackney is the backend for one of the most popular HTTP libraries for Elixir (HTTPoison), most developers who are using it are not even aware of hackney_url or Hackney's direct APIs. While ignorance isn't generally a good thing, I wouldn't call people lazy if they don't dig deep into the dependencies and internals of each library they are using. Especially if it relates to something as common as URL parsing.

I recently had a similar obscure problem in my (much smaller of course) project, where I needed to filter out certain protocols. While investigating a report, I learned that the URL standard or it's common implementation allows having a \n inside the protocol name, where it will be silently removed by browsers and most URL parsing libraries, so java\nscript:alert(1) is perfectly valid (and executable!) URL.

Point being that it's actually critical that all libraries do URL parsing exactly the same way, not deviating from each other, across all programming languages and frameworks — and the less bespoke code, the better — otherwise no one knows at which point it will break and for which case, due to URLs being such a fundamental concept.

(A bit unrelated, btw, but hackney_url does preserve the \n in protocol as-is, Elixir's URI just breaks completely and does not recognize it as URL, but, for example, Python's urllib, NodeJS' new URL() and Rust's url do parse it as expected; and all of these except hackney_url will parse the URL from this topic as having 127.0.0.1 as host)

Xunjin · 2025-02-12T14:22:06Z

Well I understand your frustration @benoitc that was indeed a shitty move by the CVE report, however @jewalky point is correct in my POV specifically this part:

Considering Hackney is the backend for one of the most popular HTTP libraries for Elixir (HTTPoison), most developers who are using it are not even aware of hackney_url or Hackney's direct APIs. While ignorance isn't generally a good thing, I wouldn't call people lazy if they don't dig deep into the dependencies and internals of each library they are using. Especially if it relates to something as common as URL parsing.

As this is the common use today shouldn't this fix which address be implemented?

Trevoke · 2025-02-12T16:12:13Z

@Xunjin Unless the author of HTTPoison specifically had a conversation with the author of hackney, I think the amount of responsibility that trickles down to hackney from HTTPoison is limited : HTTPoison could also stop using hackney (and yes, I realize how much work I am suggesting).

@jewalky to this point:

Point being that it's actually critical that all libraries do URL parsing exactly the same way, not deviating from each other, across all programming languages and frameworks — and the less bespoke code, the better — otherwise no one knows at which point it will break and for which case, due to URLs being such a fundamental concept.

I think this is what you meant, but I would correct this to say [...] that all libraries implement the URL spec fully and correctly because I'd rather have one correct library and a thousand wrong ones than a thousand and one wrong ones :)

kevinkirkup · 2025-02-12T16:16:54Z

@kevinkirkup It has. I'm afraid I don't understand what value this information adds to the conversation, can you help me see it?

Sorry, wondering if this is really an issue since it's been know for 11 months.

sax · 2025-02-12T16:20:46Z

So the reason we parse the user auth from the uri is because hackney is an http client.

I don't think anyone is suggesting removing the feature to parse userinfo from URIs, or even making the feature optional (though according to multiple specifications it should be deprecated).

From the RFCs that I read yesterday, though, an unescaped @ character is invalid in the userinfo part of a URI. I would expect HTTP clients to follow the same specs for parsing URIs.

According to the README, hackney supports Erlang 22.3 and later. Since uri_string was added in OTP 21, if hackney_url delegated to that, I wonder if it would solve the issue as well as reduce future maintenance for you.

Trevoke · 2025-02-12T16:58:15Z

@kevinkirkup It has. I'm afraid I don't understand what value this information adds to the conversation, can you help me see it?

Sorry, wondering if this is really an issue since it's been know for 11 months.

Ah! Thank you for clarifying, I did not understand that from your message. I think the question of whether it's really an issue is a really fair point but unfortunately this is not the place for the conversation - you should send that comment upstream to whoever decided to open this advisory - so maybe one of these two places: https://nvd.nist.gov/vuln/detail/CVE-2025-1211 or https://www.cve.org/CVERecord?id=CVE-2025-1211

[ Edit: in addition, something being known for a while is not the same thing as something not being an issue :) ]

peaceful-james · 2025-02-12T18:55:18Z

Sending good vibes to you @benoitc

Thanks for allowing an entire community of developers to stand on your shoulders. 🥇

peymor19 · 2025-02-14T22:42:47Z

(A bit unrelated, btw, but hackney_url does preserve the \n in protocol as-is, Elixir's URI just breaks completely and does not recognize it as URL, but, for example, Python's urllib, NodeJS' new URL() and Rust's url do parse it as expected; and all of these except hackney_url will parse the URL from this topic as having 127.0.0.1 as host)

To add to this, I did find in RFC 2396 that US-ASCII control characters are not to be allowed in the URI syntax https://datatracker.ietf.org/doc/html/rfc2396#section-2.4.3, but I did not find anywhere in the new RFC 3986 that mentions anything about control characters in the URI.

Maybe someone smarter than me might be able to comment on this find.

sax · 2025-02-15T14:33:08Z

The ABNF rules look to me like they do not include control characters, unless I'm missing something or there's a superseding RFC I can't find.

peymor19 · 2025-02-17T18:23:04Z

The ABNF rules look to me like they do not include control characters, unless I'm missing something or there's a superseding RFC I can't find.

Yep, you are right. To me it looks like if you pass a control character you should error but if you pass a percent-encoded control character it would be valid according to the ABNF rules. I tested this case on a few different libraries and it seems that some handle this case slightly differently.

benoitc · 2025-02-20T15:17:58Z

I am closing this issue since hackney 1.21.0 has been released and patch hackney_url to fix this vulnerability. Thanks all for the discussion and encouragements. I appreciate.

Full changelog: https://github.com/benoitc/hackney/releases/tag/1.21.0

aselder mentioned this issue Feb 11, 2025

Switch HTTP client from hackney to finch getsentry/sentry-elixir#758

Open

benoitc changed the title ~~CVE-2025-1211 — Server-side Request Forgery~~ improving url parsing Feb 11, 2025

benoitc changed the title ~~improving url parsing~~ improving url parsing: make it more strict Feb 11, 2025

benoitc added working on it high priority labels Feb 12, 2025

benoitc self-assigned this Feb 12, 2025

tank-bohr mentioned this issue Feb 12, 2025

fix(hackney_url): parses query string before userinfo #753

Merged

Trevoke mentioned this issue Feb 12, 2025

Server-side Request Forgery (SSRF) in hackney #752

Closed

benoitc closed this as completed Feb 20, 2025

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

improving url parsing: make it more strict #751

improving url parsing: make it more strict #751

sax commented Feb 11, 2025

justincy commented Feb 11, 2025

oscar-ospina commented Feb 11, 2025

benoitc commented Feb 11, 2025 •

edited

Loading

Trevoke commented Feb 11, 2025

kevinkirkup commented Feb 11, 2025

Trevoke commented Feb 11, 2025

benoitc commented Feb 11, 2025

sax commented Feb 12, 2025

warmwaffles commented Feb 12, 2025

sax commented Feb 12, 2025 •

edited

Loading

warmwaffles commented Feb 12, 2025 •

edited

Loading

benoitc commented Feb 12, 2025 •

edited

Loading

jewalky commented Feb 12, 2025

Xunjin commented Feb 12, 2025

Trevoke commented Feb 12, 2025

kevinkirkup commented Feb 12, 2025

sax commented Feb 12, 2025

Trevoke commented Feb 12, 2025 •

edited

Loading

peaceful-james commented Feb 12, 2025

peymor19 commented Feb 14, 2025 •

edited

Loading

sax commented Feb 15, 2025

peymor19 commented Feb 17, 2025

benoitc commented Feb 20, 2025 •

edited

Loading

improving url parsing: make it more strict #751

improving url parsing: make it more strict #751

Comments

sax commented Feb 11, 2025

justincy commented Feb 11, 2025

oscar-ospina commented Feb 11, 2025

benoitc commented Feb 11, 2025 • edited Loading

Trevoke commented Feb 11, 2025

kevinkirkup commented Feb 11, 2025

Trevoke commented Feb 11, 2025

benoitc commented Feb 11, 2025

sax commented Feb 12, 2025

warmwaffles commented Feb 12, 2025

sax commented Feb 12, 2025 • edited Loading

warmwaffles commented Feb 12, 2025 • edited Loading

benoitc commented Feb 12, 2025 • edited Loading

jewalky commented Feb 12, 2025

Xunjin commented Feb 12, 2025

Trevoke commented Feb 12, 2025

kevinkirkup commented Feb 12, 2025

sax commented Feb 12, 2025

Trevoke commented Feb 12, 2025 • edited Loading

peaceful-james commented Feb 12, 2025

peymor19 commented Feb 14, 2025 • edited Loading

sax commented Feb 15, 2025

peymor19 commented Feb 17, 2025

benoitc commented Feb 20, 2025 • edited Loading

benoitc commented Feb 11, 2025 •

edited

Loading

sax commented Feb 12, 2025 •

edited

Loading

warmwaffles commented Feb 12, 2025 •

edited

Loading

benoitc commented Feb 12, 2025 •

edited

Loading

Trevoke commented Feb 12, 2025 •

edited

Loading

peymor19 commented Feb 14, 2025 •

edited

Loading

benoitc commented Feb 20, 2025 •

edited

Loading