URL with multiple consecutive dots should not be valid? #25
Comments
http://stackoverflow.com/questions/27142359/is-a-url-with-multiple-consecutive-dots-valid Related: http://stackoverflow.com/questions/1547899/which-characters-make-a-url-invalid |
Just wanted to check what the status was since 2.0.0, still the same:
irb(main):002:0> Twingly::URL.parse("www..hej..se").valid?
=> true |
The regular one-year check:
$ pry
[1] pry(main)> require "twingly/url"
=> true
[2] pry(main)> Twingly::URL.parse("www..hej..se")
=> #<Twingly::URL:0x3fe1840b4dac http://www..hej..se>
[3] pry(main)> Twingly::URL.parse("www..hej..se").valid?
=> true
[4] pry(main)> Twingly::URL::VERSION
=> "5.0.1" |
Someone reported this upstream: weppos/publicsuffix-ruby#150 |
$ pry -rtwingly/url
[1] pry(main)> Twingly::URL.parse("www..hej..se")
=> #<Twingly::URL:0x3fc97ae4e958 http://www..hej..se>
[2] pry(main)> Twingly::URL.parse("www..hej..se").valid?
=> true
[3] pry(main)> Twingly::URL::VERSION
=> "5.1.1" |
[9] pry(main)> (url.methods - Object.methods - [:between?, :clamp]).each { |method| puts "#{method}: #{url.public_send(method).inspect}" }
path: ""
host: "www..hej..se"
password: ""
valid?: true
scheme: "http"
userinfo: ""
user: ""
domain: "hej.se"
origin: "http://www..hej..se"
normalized: #<Twingly::URL:0x3fe6b8d3a804 http://www..hej..se/>
sld: "hej"
trd: "www."
tld: "se"
ttld: "se"
without_scheme: "//www..hej..se"
normalized_scheme: "http"
normalized_host: "www..hej..se"
normalized_path: "/" |
I am the one who raised the issue there.. |
I'm not sure what the best response is to the Instead of:
irb(main):003:0> Twingly::URL.parse("www..hej..se").to_s
=> "http://www..hej..se"
irb(main):004:0> Twingly::URL.parse("www..hej..se").normalized.to_s
=> "http://www..hej..se/"
It would do this:
irb(main):003:0> Twingly::URL.parse("www..hej..se").to_s
=> "http://www..hej..se"
irb(main):004:0> Twingly::URL.parse("www..hej..se").normalized.to_s
=> "http://www.hej.se/"
New issue for this? |
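The normalization proposed above could be sketched in plain Ruby roughly like this. `collapse_dots` is a hypothetical helper made up for illustration, not part of twingly-url, and it only looks at the host string; how the gem would wire this into `normalized` is left open.

```ruby
# Hypothetical helper, not part of twingly-url: collapses every run of
# consecutive dots in a host name into a single dot.
def collapse_dots(host)
  # String#squeeze(".") replaces repeated "." characters with one "."
  host.squeeze(".")
end

puts collapse_dots("www..hej..se")
```

Note that this changes the host rather than rejecting it, which is the normalization question, not the validity question.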
@jage Sounds good to me, yeah a new issue for that is the best I think |
@jage Ah no, I don't think we should do that, because of
|
I think the goal for twingly-url normalize is to do the same as our .NET normalizer, correct? |
I don't want |
Then I think we have the answer? If we should do the same, it should not be valid (i.e. dropped).
Yes it would be best if it's just "the same" normalisation, but I think this is a change where we could change the .NET normalisation as well. I'm not trying to fix a URL, I'm trying to create a normalized form so we can compare and see if the URL is blocked, already in the system etc. |
Let's keep the discussion of the validity here, and normalization in #125 |
I see. I think not valid makes sense because other tools don't treat |
$ ruby -e "require 'twingly/url'; url = Twingly::URL.parse('www..hej..se') ; p Twingly::URL::VERSION, url, url.valid?"
"6.0.2"
#<Twingly::URL:0x3fd0578a13c8 http://www..hej..se>
true |
#158 made these URLs invalid, this issue can be closed! |
It's been a long time. 😄 |
$ dig www..hej..se
dig: 'www..hej..se' is not a legal name (empty label)
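The "empty label" rule that dig reports can be sketched as a small check in plain Ruby. `empty_label?` is a hypothetical helper for illustration only; it is not the twingly-url API and not necessarily how #158 implemented the fix. It assumes a single trailing dot (a fully qualified name) should still be accepted.

```ruby
# Hypothetical helper, not part of twingly-url: returns true when any
# label between dots is empty, as in "www..hej..se".
def empty_label?(host)
  # A limit of -1 keeps trailing empty strings, so "hej.se.." is caught too
  labels = host.split(".", -1)
  # Assumption: one trailing dot marks a fully qualified name and is allowed
  labels.pop if labels.length > 1 && labels.last.empty?
  labels.any?(&:empty?)
end

p empty_label?("www..hej..se")
p empty_label?("www.hej.se")
```

A validity check along these lines would reject the host outright instead of trying to repair it.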