-
Notifications
You must be signed in to change notification settings - Fork 959
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Path is decoded in createLocation #505
Comments
Alright, so just to be clear, the reasoning behind decoding in // with decoding
<Route path='/역사' />
// without decoding
<Route path='/%EC%97%AD%EC%82%AC' /> Another effect of this is that params will be automatically decoded. <Route path='/greeting/:word' />
// user visits /greeting/안녕하세요
match === {
...,
params: { word: '안녕하세요' }
// without decoding this would be %EC%95%88%EB%85%95%ED%95%98%EC%84%B8%EC%9A%94
} History then relies on the browser to automatically encode the pathname when we push to the browser. This appears to work for all characters except for the percent sign. There are a few thought I have on possible ways to deal with this: Stop all decodingI don't think that this is desirable, but I'll at least throw it out there. We could leave the decoding to the program that is using history's locations. // URI = '/역사'
// window.location.pathname = '/%EC%97%AD%EC%82%AC'
{ pathname: '/%EC%97%AD%EC%82%AC', ...}
history.push('/역사')
{ pathname: '/역사', ...} Add a
|
I'm sorry, I'm not familiar with the code, but is it not possible to |
what is the recommended workaround for this? in components, the route params are only partially decoded, see referenced issue. we have file names that are coming from users, and sometimes they have |
I have the same problem. and history converts this to the url: http://localhost:3000/data/Amazing Example History removes my empty space! But why :) |
I guess I have a similar issue as @janhoeck How can I work around it? ( 2x encodeURIComponent is not really a nice fix...but would kinda help right now :D ) |
Do you know why browsers don't encode the |
@OliverJAsh Browsers use percent-encoding, so if they encoded the
|
Ah. I've noticed browsers also do this when you enter a URL in the address bar, e.g. IIUC: as a user, when entering the URL in the address bar, we know the query params are not encoded, but the computer does not! |
Perhaps one argument in favour of stopping all decoding in this module is for consistency with HTML5's native history. Currently we have this inconsistency: history.push(`/foo/${encodeURIComponent('foo % bar')}`)
// => URL is "/foo/foo%20%%20bar"
// => `history.location.pathname` is `/foo/foo % bar`
window.history.pushState(null, '', `/foo/${encodeURIComponent('foo % bar')}`)
// => URL is "/foo/foo%20%25%20bar"
// => `window.location.pathname` is `/foo/foo%20%25%20bar` Both the generated URL and |
I think my vote is to make our @pshrmn Isn't there some other way we can provide decoded parameters to RR users? Instead of decoding the pathnames in history, can we do it later in the router (i.e. just before we need to match with path-to-regexp)? |
RR could decode in If decoding is removed, there will be an inconsistency in location objects unless |
I haven't thought through the implications, but if you wanted to enforce that pathnames are encoded for consistency, that could be done using an anchor. That would leave encoded ones alone, but ensure anything that should be encoded is. function encode(pathname) {
const a = document.createElement('a');
a.setAttribute('href', pathname);
return a.pathname;
} |
It also appears this doesn't work for <a href="https://google.com/foo & bar"/>go</a> When clicked, the browser navigates to I don't know whether this needs to be considered as part of this or not—just pointing it out. |
@OliverJAsh that is the |
@pshrmn I've updated my example to show the same problem for pathnames. |
@pshrmn In fact it would seem the browser only URI encodes spaces, unlike I presume it's using (My example uses query parameters for easy testing, but the same browser mechanism is used for pathnames.) |
I'm trying to remember exactly what I was talking about in that comment (it has been six months since I wrote it). I believe that what I meant by "this appears to work for all characters except for the percent sign" is that the percent sign is the only character that we cannot safely decode and expect the browser to properly encode. From MDN,
Also from MDN, the following characters (which does not include the percent sign) are not encoded by
history.push('/test&ing');
const decoded = decodeURI('/test&ing'); // /test&ing
history.pushState(null, '', decoded)
// window.location.pathname = '/test&ing'
history.push('/test%26ing');
const decoded = decodeURI('/test%26ing'); // /test%26ing
// window.location.pathname = '/test%26ing'
history.push('/test%25ing'); // %25 = encodeURI('%')
const decoded = decodeURI('/test%25ing'); // /test%ing
// window.location.pathname = '/test%ing' |
@pshrmn That makes sense! IIUC, this issue only concerns the percent symbol and not other characters that will be decoded by However, it is quite surprising to find that some characters in const myLocation = history.createLocation({ pathname: `/search/${encodeURIComponent('foo & bar')}` });
historyInstance.push(myLocation);
myLocation.pathname
// => "/search/foo %26 bar"
window.location.pathname
// => "/search/foo%20%26%20bar" Even if most of the decoded characters are correctly re-encoded by the browser, this is still an issue when working with I think this is also called out in remix-run/react-router#5607. |
For anyone who runs into this issue and is looking for workarounds… On unsplash.com we are working around this by double URI encoding: history.push({
pathname: `randomlocation/${ encodeURIComponent(encodeURIComponent("AB%C")) }/furthermore`
}); This results in However, this has unintended side effects. If a user navigates directly to If we push (The same applies to location params.) To workaround this side effect, we run a "safe decode": const safeDecode = (pathname) => {
try {
return decodeURIComponent(pathname);
} catch (_error) {
return pathname;
}
} I'm sure this workaround has further implications… I'd be very happy to help getting this issue fixed. Is there enough consensus for us to go with one of the proposed solutions? |
@OliverJAsh Double encoding doesn't work because if the user hits the back button then the forward button, the error will throw again. |
@kevinliang What exactly would throw again? I tried what you said and I didn't get any exception. |
Here's how we're getting around the issue in our project:
Per the library's changelog, the encoding / decoding stuff was first added in v4.6.0, and v4.5.1 is the most recent version published without these changes. By making |
For me it seems the I tried to visit an url with reserved characters (
From random googling, people said the decodeURI/encodeURI should be used on full url's, while decodeURIComponent/encodeURIComponent on parts (component) of the url. The pathname is a component. More info here for the difference between them https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/encodeURI#Description I've wrote a few tests. If you change decodeURI to decodeURIComponent these will pass:
There are 3 (well 1 for 3 histories) tests failing, but I think that would be a false negative:
A problem would arise, when you decode %23 as an hash, and later on would re-interpertate the hash as a valid fragment (if this would be possible). Safest way would be to store the original path somewhere. Edit: I also discovered a second issue, related to this.
This means I sometime have to decode twice, and sometimes once, if I use the double encode method described. My current workaround:
|
Invoking In this thread, workarounds were suggested as to decode everything except % signs. I would argue that it is a bad idea because you're going to surprise some of your indirect users who are going to have a hard time when their RR matcher doesn't work the same way once the % character is part of the URL (i.e. URLs are partially decoded except %25, which means you need to match against %25 rather than %). Even though this represents a breaking change for people who came to rely on this implicit decoding, it is also a mandatory bug fix as applications relying directly or indirectly on this library are broken in regards to the % character and there is no reasonable workaround without fixing the root cause. In conclusion, better figure out a path to make the breaking change at the time of your choosing rather than wait for a browser vendor to force you into it. In the mean time, the main Readme should list this as a known issue. |
We are ditching RR for @reach router because of this massive issue. |
Trying to figure out where this issue now stands and how we can move forward.
Plan:
I've started 1 in #656. |
As a matter of API design, if you want to allow users to provide decoded paths, you could accept a list instead of a string: {
pathname: '/search/a%2Fb%20testing/books'
}
// should be equivalent to something like
{
pathname: ['search', 'a/b testing', 'books']
} The relationship between each form should be approximately:
|
Update on #505 (comment). We've been waiting for a few weeks now for feedback from a maintainer of this repo. No signs yet. :-( In the meantime, I thought I'd share how we're working around this at Unsplash. We use patch-package, which lets us modify the contents of patch-package
--- a/node_modules/history/LocationUtils.js
+++ b/node_modules/history/LocationUtils.js
@@ -44,15 +44,9 @@ var createLocation = exports.createLocation = function createLocation(path, stat
if (state !== undefined && location.state === undefined) location.state = state;
}
- try {
- location.pathname = decodeURI(location.pathname);
- } catch (e) {
- if (e instanceof URIError) {
- throw new URIError('Pathname "' + location.pathname + '" could not be decoded. ' + 'This is likely caused by an invalid percent-encoding.');
- } else {
- throw e;
- }
- }
+ // Waiting for upstream PR to be merged
+ // https://github.com/ReactTraining/history/issues/505#issuecomment-453175833
+ // https://github.com/ReactTraining/history/pull/656
if (key) location.key = key;
--- a/node_modules/history/es/LocationUtils.js
+++ b/node_modules/history/es/LocationUtils.js
@@ -31,16 +31,6 @@ export var createLocation = function createLocation(path, state, key, currentLoc
if (state !== undefined && location.state === undefined) location.state = state;
}
- try {
- location.pathname = decodeURI(location.pathname);
- } catch (e) {
- if (e instanceof URIError) {
- throw new URIError('Pathname "' + location.pathname + '" could not be decoded. ' + 'This is likely caused by an invalid percent-encoding.');
- } else {
- throw e;
- }
- }
-
if (key) location.key = key;
if (currentLocation) {
--- a/node_modules/history/umd/history.js
+++ b/node_modules/history/umd/history.js
@@ -157,16 +157,6 @@ return /******/ (function(modules) { // webpackBootstrap
if (state !== undefined && location.state === undefined) location.state = state;
}
- try {
- location.pathname = decodeURI(location.pathname);
- } catch (e) {
- if (e instanceof URIError) {
- throw new URIError('Pathname "' + location.pathname + '" could not be decoded. ' + 'This is likely caused by an invalid percent-encoding.');
- } else {
- throw e;
- }
- }
-
if (key) location.key = key;
if (currentLocation) { |
This PR covers the Routing + Page Skeleton of the Network IP Details page, including: * New route `app/secops#/network/ip/:ipv4_or_ipv6` * Empty page `ip_details` following convention from `host_details` * Links to IP fields for following components: * Authentications Table * Events Table * Host Summary * Network Top N Flow Table * Timeline Table * HoC `HostDetailsLink` and `IPDetailsLink` for easily linking to the `HostDetails` or `IP Details` pages * Refactored `Breadcrumb` component and navigation logic using react-router's `withRouter` * Added `encodeIpv6` & `decodeIpv6` helper functions (see note below) * Added tests for: * HeaderBreadcrumbsComponent * Custom links `HostDetailsLink` and `IPDetailsLink` * ipv6 helper functions Includes fixes for: * https://github.com/elastic/ingest-dev/issues/310 * https://github.com/elastic/ingest-dev/issues/234 Note: Due to remix-run/history#505 and the way SecOp's react-router is nested within Kibana's native routing (via `link-to`), URL's that contain characters that require URL Encoding (like colons in ipv6) can break the back button. This occurs as react-router adds a history entry for the decoded URL, so when pressing the back button the user returns to the decoded `link-to` url, which in turns re-routes them back to the page they were on. Speaking with @XavierM, we decided an initial solution for this would be to replace the `:`'s in ipv6's with `-`'s. As referenced in the issue above, it looks as if there is potential for a more robust solution we could implement at the `link-to` layer of our application. In the meantime, our current solution maintains the functionality of the back button, and also allows the user to manually input ipv6's with `:`'s into the address bar without breaking navigation. 
There's some discussion in #656 about what the correct behaviour is (as opposed to decoding
What do people here think? Can anyone foresee 2 causing issues? With 2, I'm worried about bugs in the normalization process. This would be off loaded to /cc @pshrmn |
IMHO this is impossible to implement. Let's say I have a path template like
A generic piece of code cannot identify that this is already properly encoded. This is the whole reason why this issue here came up.
The value is called P.S. We are making heavy use of matrix parameters at Instana. Due to this bug we are operating a fork of history for the last 1.5 years which just removes the double encoding. It is working perfectly fine for us. |
@bripkens With this sandbox you can see that it handles fully encoded, non-encoded, and partially encoded strings. |
@bripkens is right when he says it is impossible to implement. In order to correctly encode an URL, you must be able to prove the reverse transformation is always possible without any loss of information. Otherwise you might be causing prejudice to the application that expects to see the same URL that was given to the system originally. Your sandbox demonstrates this impossibility. Three different source URLs are encoded to the same resulting URL. When you decode the final URL, you cannot possibly retrieve the correct source URL all of the time because you cannot guess the intent of the user. This will inevitably lead to subtle bugs that often go unnoticed. One final example to demonstrate how encodeurl will screw up some URLs is shown in this sandbox I derived from yours. Assuming I create an article with the following title : %20 is a space %-encoded, the urlencode library sees it as being partially encoded when really, it is not encoded at all. The resulting encoded string is the following : %20%20is%20a%20space%20%25-encoded which then decodes to ** is a space %-encoded**, corrupting the title of my article by decoding the litteral %20 I originally used. |
@Dalzhim I'll start out with your final example; please let me know if I am misinterpreting your point. Let's start out by completely ignoring the If we use history.pushState(null, "", "/%20 is a space %-encoded")
window.location.pathname === "/%20%20is%20a%20space%20%-encoded" The browser does not encode percentage signs, so the initial decodeURI(window.location.pathname) // this throws an URIError To make the URI valid, we can pass it to an history.pushState(null, "", encodeURI("/%20 is a space %-encoded"))
window.location.pathname === "/%2520%20is%20a%20space%20%25-encoded"
decodeURI(window.location.pathname) // "/%20 is a space %-encoded" CORRECT Now, getting back to history.pushState(null, "", encodeurl("/%20 is a space %-encoded"))
window.location.pathname === "/%20%20is%20a%20space%20%25-encoded"
decodeURI(window.location.pathname) // "/ is a space %-encoded" INCORRECT
history.pushState(null, "", encodeurl(encodeURI("/%20 is a space %-encoded")))
window.location.pathname === "/%2520%20is%20a%20space%20%25-encoded"
decodeURI(window.location.pathname) // "/%20 is a space %-encoded" CORRECT In the first case, we still have the incorrect pathname, but it is no longer an invalid URI because In the second case, we get the same (correct) result as the plain
|
@pshrmn : Your previous message is 100% correct. The user is the only one who has enough information to disambiguate any malformed URL. The first case ended up with a valid, but meaningless URL whereas the second example displays proper construction of the URL by the user, which makes it possible for the library to handle in a generic way. In other words, |
Someone who knows better than me should raise a PR to |
Can I ask before this issue, maybe rr 3 and below, what were the isssues with how it treated path params and what not? I don’t remember tbh having an issue until 4. |
The `history` package automatically tries to decode the URI and will break if the URI has an unencoded `%`. This problem usually props up when users manually manipulate the URL since in-app navigation will properly handle encoding. This change will redirect the user to the current path and strip all query strings. See remix-run/history#505 - They merged a change that would make the `history` package not always attempt to decode the URI, but requires a major version bump (and corresponding updates to react-router). Fixes JAVASCRIPT-7W0
…#13306) The `history` package automatically tries to decode the URI and will break if the URI has an unencoded `%`. This problem usually props up when users manually manipulate the URL since in-app navigation will properly handle encoding. Currently, when this happens, the user is greeted with a big empty page. Instead, this change will redirect the user to the current path and strip all query strings so that we're able to at least load the Sentry app. See remix-run/history#505 - They merged a change that would make the `history` package not always attempt to decode the URI, but requires a major version bump (and corresponding updates to react-router). Fixes JAVASCRIPT-7W0 Fixes JAVASCRIPT-5XJ
This results in an invalid URL being pushed to the HTML5 history element when there are encoded percent signs in the pathname.
E.g.:
results in
"randomlocation/AB%C/futhermore"
being pushed to the HTML5 history.The text was updated successfully, but these errors were encountered: