Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(inputs.whois): Add plugin #16509

Open
wants to merge 33 commits into
base: master
Choose a base branch
from

Conversation

paulojmdias
Copy link
Contributor

Summary

This PR adds the WHOIS input plugin to Telegraf, allowing users to collect WHOIS data for configured domains. It retrieves expiration timestamps, registrar details, and domain status. The implementation leverages likexian/whois and likexian/whois-parser for efficient lookups and parsing. Tests and documentation are included.

Checklist

  • No AI generated code was used in this PR

Related issues

resolves #7893

@telegraf-tiger telegraf-tiger bot added feat Improvement on an existing feature such as adding a new setting/mode to an existing plugin plugin/input 1. Request for new input plugins 2. Issues/PRs that are related to input plugins labels Feb 12, 2025
Signed-off-by: Paulo Dias <[email protected]>
Copy link
Contributor

@Hipska Hipska left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this useful addition.

Signed-off-by: Paulo Dias <[email protected]>

Remove file

Signed-off-by: Paulo Dias <[email protected]>
Signed-off-by: Paulo Dias <[email protected]>

Remove file

Signed-off-by: Paulo Dias <[email protected]>
Copy link
Contributor

@Hipska Hipska left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work, some more comments..

Signed-off-by: Paulo Dias <[email protected]>
Signed-off-by: Paulo Dias <[email protected]>
Signed-off-by: Paulo Dias <[email protected]>
Signed-off-by: Paulo Dias <[email protected]>
Signed-off-by: Paulo Dias <[email protected]>
Comment on lines 166 to 168
plugin.whoisLookup = func(_ *whois.Client, _ string, _ string) (string, error) {
return "", errors.New("whois lookup failed")
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should the lookup fail in this scenario? Or just return an empty result? (No domain info found for %s)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know, what happens in real life when doing this to a real WHOIS server?

In what scenario does it return an empty Domain object?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At least on the library, if the lookup fails, the library will throw an error.

The Domain will return an empty object when whoisparser.Parse fails.

Copy link
Contributor

@Hipska Hipska Feb 14, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But then the returned err will not be null, so the code will never reach that scenario?

It seems I was mislead by the error message "No domain info found for %s", so I start to think the plugin should return a metric with status_code 0 when the whoisLookup fails..

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wait, forget what I said. After looking into the whois parser code. It looks like additional processing of the returned error type needs to be done.

So no metric on whoisLookup error, but metric on all types of whoisParse errors except for ErrDomainDataInvalid and maybe ErrDomainLimitExceed depending on how you see it..

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added the support for additional status_code when an error on failed parsing occurs.

I only ignored ErrDomainDataInvalid since I feel useful to keep ErrDomainLimitExceed for the users to adjust the interval config if needed.

Copy link
Contributor

@Hipska Hipska Feb 14, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, still not sure about ErrDomainLimitExceed. IMHO it should be an error message in the log and not report a metric, since it is not a state of the domain. The user will see it has no data for this specific metric, may check the logs and then adjust config.

TLDR; I don't think status code 10 is needed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have mixed feelings about that, but I pushed the changes for it. Will throw an error message if the parsing error is ErrDomainDataInvalid or ErrDomainLimitExceed

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can see what the maintainers think of it..

Copy link
Contributor

@Hipska Hipska left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like we're almost there..

Comment on lines 166 to 168
plugin.whoisLookup = func(_ *whois.Client, _ string, _ string) (string, error) {
return "", errors.New("whois lookup failed")
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know, what happens in real life when doing this to a real WHOIS server?

In what scenario does it return an empty Domain object?

updated := parsedWhois.Domain.UpdatedDateInTime
if updated == nil {
w.Log.Warnf("No updated date found for %s", domain)
continue
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that now they will be emitted as field with value 0 (meaning January 1, 1970)

Copy link
Contributor

@Hipska Hipska left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great progress!

Copy link
Contributor

@Hipska Hipska left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for being patient! Good work.

@paulojmdias
Copy link
Contributor Author

Thanks for being patient! Good work.

Thanks for helping making this perfect 🙌

Copy link
Member

@srebhan srebhan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for contributing this nice plugin @paulojmdias! I do have some comments in the code. One general thing I would love to see is to mock a WHOIS server in the tests so we can get rid of the function injection. I think the whois lib's unit-test might give a good starting point...

Comment on lines 175 to 184
if w.whoisLookup == nil {
w.whoisLookup = func(client *whois.Client, domain, server string) (string, error) {
return client.Whois(domain, server)
}
}
if w.parseWhoisData == nil {
w.parseWhoisData = func(raw string) (whoisparser.WhoisInfo, error) {
return whoisparser.Parse(raw)
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would really appreciate if we can get rid of this indirection that is solely there for testing. Can we rather mock a WHOIS server response in the tests? In any case this could be

Suggested change
if w.whoisLookup == nil {
w.whoisLookup = func(client *whois.Client, domain, server string) (string, error) {
return client.Whois(domain, server)
}
}
if w.parseWhoisData == nil {
w.parseWhoisData = func(raw string) (whoisparser.WhoisInfo, error) {
return whoisparser.Parse(raw)
}
}
if w.whoisLookup == nil {
w.whoisLookup = func(client *whois.Client, domain, server string) (string, error) {
return client.Whois(domain, server)
}
}
if w.parseWhoisData == nil {
w.parseWhoisData = whoisparser.Parse
}

Copy link
Contributor Author

@paulojmdias paulojmdias Feb 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I removed the code function overwrite in the tests also. Please double-check for now.

However, you want to have some tests like this for this plugin? With some custom files on this side?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Exactly. We do have some tests (e.g. here) with some "test cases" being a self-contained directly each and some testing mock to send the test-case data to the plugin...

Comment on lines 149 to 160
// Validate `expiration_timestamp` field (2025-08-13T04:00:00Z → Unix)
expectedExpiration := int64(1755057600)
expirationValue, found := acc.Int64Field("whois", "expiration_timestamp")
require.True(t, found, "expiration_timestamp field missing")
require.InDelta(t, expectedExpiration, expirationValue, 10)

// Validate `expiry` field
now := time.Now()
expectedExpiry := int(expectedExpiration - now.Unix())
expiryValue, found := acc.IntField("whois", "expiry")
require.True(t, found, "expiry field missing")
require.InDelta(t, expectedExpiry, expiryValue, 10) // Allow small delta due to execution time
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This calls for flaky tests as sometimes the CI infrastructure has a really high load and will likely exceed your delta. As you mock the response, how about putting an exact time in there we can check for?

Copy link
Contributor Author

@paulojmdias paulojmdias Feb 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't disagree, however, the expiry will always be dynamic, since is computed based on the current timestamp. What did you suggest here?

I don't think makes sense to do something like the below since will also fail in make lint tests. Maybe remove the required value validation from expiry?

require.Equal(t, expiryValue, expiryValue)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can ignore the fields that are computed in the comparison using the testutil.IgnoreFields option and then afterwards just check it is require.Positive?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@srebhan i pushed using the same approach as x509_cert which uses testutil.IgnoreFields("expiry"). IdK if you want to keep the same approach or not, but please let me know.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@srebhan I only saw your last comment after my push. Please review my changes again 🙏

@srebhan srebhan changed the title feat(inputs.whois): add input plugin for whois data feat(inputs.whois): Add plugin Feb 17, 2025
@srebhan srebhan self-assigned this Feb 17, 2025
Comment on lines +191 to +193
return &Whois{
Timeout: config.Duration(5 * time.Second),
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing the default value for Server

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the heads-up. This makes me find the server sent to the Client is not working with the same behavior as being empty. I have different outputs if I send whois.iana.org as a server or if I send nothing to the client.

Output setting server as empty:

whois,domain=influxdata-test.com status_code=6i 1739542332000000000

Output setting server as the supposed default whois.iana.org:

> whois,domain=influxdata-test.com creation_timestamp=473385600i,dnssec_enabled=false,expiration_timestamp=0i,expiry=0i,name_servers="a.gtld-servers.net,b.gtld-servers.net,c.gtld-servers.net,d.gtld-servers.net,e.gtld-servers.net,f.gtld-servers.net,g.gtld-servers.net,h.gtld-servers.net,i.gtld-servers.net,j.gtld-servers.net,k.gtld-servers.net,l.gtld-servers.net,m.gtld-servers.net",registrant="",registrar="",status_code=5i,updated_timestamp=1701907200i 1739809813000000000

Technically, what is happening is (like whois CLI), when we send a server, the whois won't follow redirection if it is not explicitly defined, which leads to only parsing the data from the TLD, which is this case is .com

     -Q      Do a quick lookup; whois will not attempt to follow referrals to other whois servers.  This is the default if a server is explicitly specified using one of the other options or in an environment variable.  See also the -R option.

Result with the default behavior:

❯ whois -h whois.iana.org influxdata-test.com
% IANA WHOIS server
% for more information on IANA, visit http://www.iana.org
% This query returned 1 object

refer:        whois.verisign-grs.com

domain:       COM

organisation: VeriSign Global Registry Services
address:      12061 Bluemont Way
address:      Reston VA 20190
address:      United States of America (the)

contact:      administrative
name:         Registry Customer Service
organisation: VeriSign Global Registry Services
address:      12061 Bluemont Way
address:      Reston VA 20190
address:      United States of America (the)
phone:        +1 703 925-6999
fax-no:       +1 703 948 3978
e-mail:       [email protected]

contact:      technical
name:         Registry Customer Service
organisation: VeriSign Global Registry Services
address:      12061 Bluemont Way
address:      Reston VA 20190
address:      United States of America (the)
phone:        +1 703 925-6999
fax-no:       +1 703 948 3978
e-mail:       [email protected]

nserver:      A.GTLD-SERVERS.NET 192.5.6.30 2001:503:a83e:0:0:0:2:30
nserver:      B.GTLD-SERVERS.NET 192.33.14.30 2001:503:231d:0:0:0:2:30
nserver:      C.GTLD-SERVERS.NET 192.26.92.30 2001:503:83eb:0:0:0:0:30
nserver:      D.GTLD-SERVERS.NET 192.31.80.30 2001:500:856e:0:0:0:0:30
nserver:      E.GTLD-SERVERS.NET 192.12.94.30 2001:502:1ca1:0:0:0:0:30
nserver:      F.GTLD-SERVERS.NET 192.35.51.30 2001:503:d414:0:0:0:0:30
nserver:      G.GTLD-SERVERS.NET 192.42.93.30 2001:503:eea3:0:0:0:0:30
nserver:      H.GTLD-SERVERS.NET 192.54.112.30 2001:502:8cc:0:0:0:0:30
nserver:      I.GTLD-SERVERS.NET 192.43.172.30 2001:503:39c1:0:0:0:0:30
nserver:      J.GTLD-SERVERS.NET 192.48.79.30 2001:502:7094:0:0:0:0:30
nserver:      K.GTLD-SERVERS.NET 192.52.178.30 2001:503:d2d:0:0:0:0:30
nserver:      L.GTLD-SERVERS.NET 192.41.162.30 2001:500:d937:0:0:0:0:30
nserver:      M.GTLD-SERVERS.NET 192.55.83.30 2001:501:b1f9:0:0:0:0:30
ds-rdata:     19718 13 2 8acbb0cd28f41250a80a491389424d341522d946b0da0c0291f2d3d771d7805a

whois:        whois.verisign-grs.com

status:       ACTIVE
remarks:      Registration information: http://www.verisigninc.com

created:      1985-01-01
changed:      2023-12-07
source:       IANA

Results explicitly saying to follow redirects when the server is set:

❯ whois -h whois.iana.org -R influxdata-test.com
% IANA WHOIS server
% for more information on IANA, visit http://www.iana.org
% This query returned 1 object

refer:        whois.verisign-grs.com

domain:       COM

organisation: VeriSign Global Registry Services
address:      12061 Bluemont Way
address:      Reston VA 20190
address:      United States of America (the)

contact:      administrative
name:         Registry Customer Service
organisation: VeriSign Global Registry Services
address:      12061 Bluemont Way
address:      Reston VA 20190
address:      United States of America (the)
phone:        +1 703 925-6999
fax-no:       +1 703 948 3978
e-mail:       [email protected]

contact:      technical
name:         Registry Customer Service
organisation: VeriSign Global Registry Services
address:      12061 Bluemont Way
address:      Reston VA 20190
address:      United States of America (the)
phone:        +1 703 925-6999
fax-no:       +1 703 948 3978
e-mail:       [email protected]

nserver:      A.GTLD-SERVERS.NET 192.5.6.30 2001:503:a83e:0:0:0:2:30
nserver:      B.GTLD-SERVERS.NET 192.33.14.30 2001:503:231d:0:0:0:2:30
nserver:      C.GTLD-SERVERS.NET 192.26.92.30 2001:503:83eb:0:0:0:0:30
nserver:      D.GTLD-SERVERS.NET 192.31.80.30 2001:500:856e:0:0:0:0:30
nserver:      E.GTLD-SERVERS.NET 192.12.94.30 2001:502:1ca1:0:0:0:0:30
nserver:      F.GTLD-SERVERS.NET 192.35.51.30 2001:503:d414:0:0:0:0:30
nserver:      G.GTLD-SERVERS.NET 192.42.93.30 2001:503:eea3:0:0:0:0:30
nserver:      H.GTLD-SERVERS.NET 192.54.112.30 2001:502:8cc:0:0:0:0:30
nserver:      I.GTLD-SERVERS.NET 192.43.172.30 2001:503:39c1:0:0:0:0:30
nserver:      J.GTLD-SERVERS.NET 192.48.79.30 2001:502:7094:0:0:0:0:30
nserver:      K.GTLD-SERVERS.NET 192.52.178.30 2001:503:d2d:0:0:0:0:30
nserver:      L.GTLD-SERVERS.NET 192.41.162.30 2001:500:d937:0:0:0:0:30
nserver:      M.GTLD-SERVERS.NET 192.55.83.30 2001:501:b1f9:0:0:0:0:30
ds-rdata:     19718 13 2 8acbb0cd28f41250a80a491389424d341522d946b0da0c0291f2d3d771d7805a

whois:        whois.verisign-grs.com

status:       ACTIVE
remarks:      Registration information: http://www.verisigninc.com

created:      1985-01-01
changed:      2023-12-07
source:       IANA

# whois.verisign-grs.com

No match for domain "INFLUXDATA-TEST.COM".
>>> Last update of whois database: 2025-02-17T18:34:30Z <<<

So, I do not understand why the library is not respecting the disableReferral, which by default is false, but either way, it seems the return comes from here.

Due to these concerns, does it make sense to support custom whois servers, or should we open an issue and let that happen for the next interaction?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The main issue is on this line. When a server is provided, in the end, they return an answer from all the steps. When is not provided, the library will get the server at this point, query directly the server here and if there is not any additional whois referral server, will return here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess it is best to report that issue upstream, since they are providing a way to set a custom server, it should be supported.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

@srebhan srebhan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Update to my comments...

Comment on lines +8 to +12
⭐ Telegraf v1.34.0

🏷️ network, web

💻 all
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please do not introduce empty lines between those to keep the same format as for other plugins!

Suggested change
⭐ Telegraf v1.34.0
🏷️ network, web
💻 all
⭐ Telegraf v1.34.0
🏷️ network, web
💻 all

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

image

This will make appear everything on the same line, shouldn't be more readable to keep it in different lines?


// Ensure timeout is valid
if w.Timeout <= 0 {
return fmt.Errorf("%v is an invalid timeout value", w.Timeout)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
return fmt.Errorf("%v is an invalid timeout value", w.Timeout)
return errors.New("negative timeouts are not supported")

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The condition is less or equal 0. So, not only negative timeouts.

Maybe something like negative timeouts or 0 are not supported?

Comment on lines 149 to 160
// Validate `expiration_timestamp` field (2025-08-13T04:00:00Z → Unix)
expectedExpiration := int64(1755057600)
expirationValue, found := acc.Int64Field("whois", "expiration_timestamp")
require.True(t, found, "expiration_timestamp field missing")
require.InDelta(t, expectedExpiration, expirationValue, 10)

// Validate `expiry` field
now := time.Now()
expectedExpiry := int(expectedExpiration - now.Unix())
expiryValue, found := acc.IntField("whois", "expiry")
require.True(t, found, "expiry field missing")
require.InDelta(t, expectedExpiry, expiryValue, 10) // Allow small delta due to execution time
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can ignore the fields that are computed in the comparison using the testutil.IgnoreFields option and then afterwards just check it is require.Positive?

@telegraf-tiger
Copy link
Contributor

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feat Improvement on an existing feature such as adding a new setting/mode to an existing plugin new plugin plugin/input 1. Request for new input plugins 2. Issues/PRs that are related to input plugins
Projects
None yet
Development

Successfully merging this pull request may close these issues.

RFC: whois telegraf input plugin
3 participants