Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add ClickHouse input plugin (instead of #3894) #6441

Merged

Conversation

kshvakov
Copy link

Required for all PRs:

  • Signed CLA.
  • Associated README.md updated.
  • Has appropriate unit tests.

This PR replace #3894

@kshvakov kshvakov mentioned this pull request Sep 25, 2019
3 tasks
@danielnelson
Copy link
Contributor

Before I review again, can you comment about the goroutine in the clickhouse library that I mentioned in the top comment of my original review?

@kshvakov
Copy link
Author

@danielnelson Hi, I didn't use the tcp protocol for the new implementation, I used the HTTP protocol and standard net/http library.
PS: ClickHouse driver uses "cached time" because the real time.Now() is very expensive for it
PS2: clickhouse_tables for metric name, I know that this metric is mergetree specific, but ClickHouse didn't provide another metrics for the tables. If CH will provide new metrics, I'll add them

@AlexKomrakov
Copy link

AlexKomrakov commented Oct 2, 2019

Have an error while using this plugin branch:

2019-10-02T11:45:00Z E! [inputs.clickhouse] Error in plugin: strconv.Atoi: parsing "18446743937525109187": value out of range

Didn't find this number in metrics tables. Mayby it can be cause of negative numbers in system.metrics table
Upd: filter all negative values in queries, error still exists

@kshvakov
Copy link
Author

kshvakov commented Oct 2, 2019

@AlexKomrakov fixed

Copy link
Contributor

@danielnelson danielnelson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, I think the choice to rewrite against the HTTP API was a good one. I have quite a few comments but it's mostly style and minor issues. The largest issue to resolve is more comprehensive tests.

var defaultTimeout = 5 * time.Second

var sampleConfig = `
[[inputs.clickhouse]]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In order to get the right generation with telegraf --usage clickhouse, we need to do do a couple clean up tasks:

  • remove the table(section) header
  • remove the file output section
  • add comment for each item, and a newline between each option
  • don't align the value column
  • make sure to show the default value, examples can be shown like:
      ## Static routing key.  Used when no routing_tag is set or as a fallback
      ## when the tag specified in routing tag is not found.
      ##   example: routing_key = "telegraf"
      # routing_key = ""
    
  • try to give defaults for as much as possible, I think we could have them for everything.
  • to set a value default to true (auto_discovery), set the default in the init function.

More style info: https://github.com/influxdata/telegraf/wiki/SampleConfig

Make sure to mention that cluster_include and cluster_exclude must be exact matches, since it would generally be expected that globs could be used here in keeping with other Telegraf filters.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@danielnelson could you look at this PR again, I improved README and SampleConfig with our style guide recommendations please look at af982f4

@sjwang90 sjwang90 added this to the 1.14.0 milestone Jan 13, 2020
@Slach
Copy link
Contributor

Slach commented Feb 16, 2020

[x] Signed CLA
@danielnelson could you look to this pull request again?

@sjwang90 sjwang90 requested a review from ssoroka February 24, 2020 19:06
var sampleConfig = `
[[inputs.clickhouse]]
timeout = 5
servers = ["http://username:[email protected]:8123"]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does this complicate things if servers have different auth?

inputs.Add("clickhouse", func() telegraf.Input {
return &ClickHouse{
AutoDiscovery: true,
InsecureSkipVerify: true,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd rather default this to false. Secure by default!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

answer to #6441 (comment)
yes, servers can have a different username and password for authorization, so we shall add the ability to set different authorization parameters for different servers, it's no so much complicated

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed on 353c73b

Timeout: timeout,
Transport: &http.Transport{
TLSClientConfig: &tls.Config{
InsecureSkipVerify: ch.InsecureSkipVerify,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

missing the rest of the config; need to be able to specify certificates.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should use the standard tls.ClientConfig options, check the http input for an example. In the sample configuration we would add the options:

  ## Optional TLS Config
  # tls_ca = "/etc/telegraf/ca.pem"
  # tls_cert = "/etc/telegraf/cert.pem"
  # tls_key = "/etc/telegraf/key.pem"
  ## Use TLS but skip chain & host verification
  # insecure_skip_verify = false

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed on 353c73b

for _, v := range args {
in = append(in, escape(v))
}
return "cluster " + expr + " (" + strings.Join(in, ", ") + ")"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
return "cluster " + expr + " (" + strings.Join(in, ", ") + ")"
return fmt.Sprintf("cluster %s (%s)", expr, strings.Join(in, ", "))

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed on 353c73b

return "WHERE " + makeFilter("IN", ch.ClusterInclude)
}

return "WHERE " + makeFilter("NOT IN", ch.ClusterExclude)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it seems like this code would make it so that you can't use both filters at the same time. shouldn't be too hard to adjust to support both

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed on 353c73b

var sampleConfig = `
[[inputs.clickhouse]]
timeout = 5
servers = ["http://username:[email protected]:8123"]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It does require you to define multiple plugins in that case.

AutoDiscovery bool `toml:"auto_discovery"`
ClusterInclude []string `toml:"cluster_include"`
ClusterExclude []string `toml:"cluster_exclude"`
Timeout int `toml:"timeout"`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For timeout use a internal.Duration so that it can be specified as a string: "5s"

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed on 353c73b

Timeout: timeout,
Transport: &http.Transport{
TLSClientConfig: &tls.Config{
InsecureSkipVerify: ch.InsecureSkipVerify,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should use the standard tls.ClientConfig options, check the http input for an example. In the sample configuration we would add the options:

  ## Optional TLS Config
  # tls_ca = "/etc/telegraf/ca.pem"
  # tls_cert = "/etc/telegraf/cert.pem"
  # tls_key = "/etc/telegraf/key.pem"
  ## Use TLS but skip chain & host verification
  # insecure_skip_verify = false

@Slach
Copy link
Contributor

Slach commented Mar 3, 2020

Dear @ssoroka and @danielnelson could you look to this PR again?

you could test it  with
`telegraf --config plugins/inputs/clickhouse/dev/telegraf_ssl.conf`

Signed-off-by: Slach <[email protected]>
Copy link
Contributor

@danielnelson danielnelson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's almost perfect, but still a few minor items we need to complete before merging.

.gitignore Outdated
@@ -3,3 +3,4 @@
/telegraf.exe
/telegraf.gz
/vendor
/.idea
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please revert this, we only add files generated by the build system to it. Instead you can add this to your Git core.excludesfile:

$ cat ~/.gitconfig:
[core]
	excludesfile = /home/dbn/.gitignore
$ cat ~/gitignore
tags
*.pyc
*.swp
.noseids

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sorry, fixed on f7bf396

var sampleConfig = `
[[inputs.clickhouse]]
timeout = 5
servers = ["http://username:[email protected]:8123"]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be done before merging, if the multiple passwords are needed then the plugin can be defined multiple times. On top of consistency with other plugins, it is important to do because this form will cause credentials to be logged, the userinfo section is deprecated, and it is too hard to escape the credentials especially with string passwords.

c.url = &url.URL{
Scheme: u.Scheme,
Host: net.JoinHostPort(c.Hostname, u.Port()),
User: u.User,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pass credentials using the header method X-ClickHouse-User, X-ClickHouse-Key. This avoids the credentials being logged as part of error messages and anyway the userinfo section of the url is deprecated (rfc 3986).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed on f7bf396

Slach added 2 commits March 4, 2020 13:10
add user and password options instead of use credentials in servers URL
rollback .gitignore

influxdata#6441 (comment)
influxdata#6441 (comment)
influxdata#6441 (comment)
Signed-off-by: Slach <[email protected]>
@Slach
Copy link
Contributor

Slach commented Mar 4, 2020

@danielnelson could you look this PR last time? I hope current implementation looks good

@danielnelson danielnelson merged commit e4caa34 into influxdata:master Mar 4, 2020
@danielnelson
Copy link
Contributor

I'm making a couple minor tweaks post merge:

  • rename user option to username
  • rename hostname tag to source in docs

athoune pushed a commit to bearstech/telegraf that referenced this pull request Apr 17, 2020
veorlo pushed a commit to GlobalNOC/telegraf that referenced this pull request May 4, 2020
idohalevi pushed a commit to idohalevi/telegraf that referenced this pull request Sep 29, 2020
arstercz pushed a commit to arstercz/telegraf that referenced this pull request Mar 5, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants