Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Easier way to check if a domain includes a valid TLD #27

Open
axelson opened this issue Feb 3, 2018 · 7 comments
Open

Easier way to check if a domain includes a valid TLD #27

axelson opened this issue Feb 3, 2018 · 7 comments

Comments

@axelson
Copy link
Contributor

axelson commented Feb 3, 2018

I have the given requirement, given an arbitrary string, check if any valid domains are listed in the string and highlight/linkify them. I already have a basic regex to split the string into potential domains.

I would like there to be a function in PublicSuffix that could more readily answer this. Right now I am using:

  @doc """
  Check if the given host contains a valid TLD that does not comprise the entire
  host (since we don't want to create a link to a TLD)
  """
  def valid_tld?(%URI{host: nil}), do: false
  def valid_tld?(%URI{host: host}) do
    rule = PublicSuffix.prevailing_rule(host)
    rule != "*" && rule != host
  end

Which I guess seems simple enough but it might be nice to have some sort of helper.

@axelson
Copy link
Contributor Author

axelson commented Feb 7, 2018

Would a PR for this functionality be welcomed?

@bkirz
Copy link
Contributor

bkirz commented Feb 8, 2018

Before we add new functionality to this library, would registrable_domain be closer to what you want?

def valid_tld?(%URI{host: nil}), do: false
def valid_tld?(%URI{host: host}) do
  PublicSuffix.registrable_domain(host) != nil
end

@axelson
Copy link
Contributor Author

axelson commented Feb 8, 2018

registrable_domain/2 doesn't work for me because it returns values for invalid TLD's. For example it doesn't return nil for:

iex> PublicSuffix.registrable_domain("a.blahzzzzz")
"a.blahzzzzz"

Even though "blahzzzzz" is not a valid TLD. Although perhaps that's just a bug in registerable_domain/2?

@bkirz
Copy link
Contributor

bkirz commented Feb 8, 2018

Interesting; it looks like we're falling back to the default rule in that case. This is correct according to the spec, but it makes the library substantially less useful. I like the public_suffix rubygem's approach of including a flag to disable the default rule:

irb(main):001:0> require 'public_suffix'
=> true
irb(main):002:0> PublicSuffix.valid?("a.blahzzzzz")
=> true
irb(main):003:0> PublicSuffix.valid?("a.blahzzzzz", default_rule: nil)
=> false

I'm thinking adding a similar option like PublicSuffix.registrable_domain("a.blahzzzzz", use_default_rule: false) to existing functions would make them better suited to your use case. How's that sound?

@axelson
Copy link
Contributor Author

axelson commented Feb 9, 2018

That would work great for me!

@andersju
Copy link
Contributor

I know this issue is old, but since it's still open, I figure it might help someone else: sounds like the already existing matches_explicit_rule?/1 could help? E.g.:

iex(18)> PublicSuffix.matches_explicit_rule?("a.com")     
true
iex(19)> PublicSuffix.matches_explicit_rule?("a.blahzzzzz")                      
false

Of course simply "com" will also be true, but you could combine it with registrable_domain/1. Having a use_default_rule option would be handy though.

@axelson
Copy link
Contributor Author

axelson commented Sep 11, 2018

This is the function that we're currently using and it's been serving us well:

  def valid_tld?(%URI{host: host}) do
    # The checking is a little convoluted
    # Would be improved by https://github.com/seomoz/publicsuffix-elixir/issues/27
    rule = PublicSuffix.prevailing_rule(host)
    rule != "*" && rule != host
  end

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants