Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: adds periodical http config reloads and change detection #10102

Conversation

toni-moreno
Copy link
Contributor

Required for all PRs:

resolves #8730

Added flags to config how connect to the remote http(s) server , how to download with witch interval, these flags are provided.

These flags apply only on http based remote config

  --watch-interval <interval>    Interval to monitor http based config files ( default 0 = deactivated),
                                 it sets a backgroup process continuously checking for new config files
                                 each <interval> duration. Server side should control if changed,
                                 a HTTP 200 (OK) response will mean there is a change since last download.
                                 a HTTP 304 (Not modified) will mean no changes
  --watch-jitter <jitter>        time variation to ensure avoid all agents downloading the config
                                 file from the server hosting it at the same time (default 10s)
                                 (only used if --watch-interval is set)
  --watch-retry-interval <interval> time in seconds to retry download config if download failed
                                  (default 20s)
  --watch-max-retries <retries>   number of retries to download config file if previously failed (default 3)
  --watch-tls-cert  <path>        Certificate File path for TLS Config on HTTP(S) Config downloads
  --watch-tls-key   <path>        Certificate Key File path for TLS Config on HTTP(S) Config downloads
  --watch-tls-key-pwd <password>  Password to decode Key file.
  --watch-tls-ca    <path>        CA File path for TLS Config on HTTP(S) Config downloads
  --watch-tls-sni   <name>        SNI(Server Name Indication) indicates which hostname it is attempting
                                  to connect to at the start of the TLS handshaking process
  --watch-insecure-skip-verify    If set this flag we use TLS but skip chain & host verification
                                  (default false)

Change detection are delegated to the Server Side by sending request with If-Modified-Since header, Server should answer 200 if changed and 304 if not

@toni-moreno toni-moreno force-pushed the feature/reload_remote_config_on_interval branch 5 times, most recently from b51a892 to 3265e9e Compare November 14, 2021 13:56
@toni-moreno
Copy link
Contributor Author

Hello , @sspaink I'm working to finish this PR in order to be merged ASAP. But I've found a couple of problems related to something external to this PR.

1) Lint Code Base / Lint Code Base (pull_request)

There is a problem while downloading the linter image on the ci engine .
image

2) test-go-1_17-386

Test has problems on the execd plugin.

image

What should I do now? Could you please help me to fix it please?

@sspaink
Copy link
Contributor

sspaink commented Nov 15, 2021

Hello @toni-moreno, the linter issue has been resolved by this PR: #10108 there is an issue with the Docker image of the latest version so we reverted back to an older one in the meantime. The execd unit test failure I am not sure about, this hasn't been a flaky test before so I would assume it is due to a change in this PR.

@toni-moreno
Copy link
Contributor Author

@sspaink , is really strange , the error where in a timeout on the execd , all my local test where all ok and execd processor is not working with the config.XXXXX funcionts that I modified.

image

Anyway I will rebase to fix the conflic file and check again.

There is any way to simulate de test-go1_17-386 in my local environment?

@toni-moreno toni-moreno force-pushed the feature/reload_remote_config_on_interval branch 4 times, most recently from 1d90c12 to 720c186 Compare November 16, 2021 23:29
@toni-moreno toni-moreno force-pushed the feature/reload_remote_config_on_interval branch from 720c186 to 3d632b3 Compare November 17, 2021 08:14
@telegraf-tiger
Copy link
Contributor

@toni-moreno
Copy link
Contributor Author

@sspaink hello I've just finished the checks and a rebase. It is ready to review and merge. Could you please?

@powersj
Copy link
Contributor

powersj commented Nov 18, 2021

@toni-moreno,

Thanks for taking the time to put this together. This feature is clearly in demand and we realize we need to provide some mechanism to provide this feature in Telegraf. I put a few thoughts in #8730 around use-cases and errors we absolutely need to handle before we add support for this in the CLI. I think this MP covers a few of them, but I want the team to take a more thorough look as well before we sign on to support these new CLI options.

@toni-moreno
Copy link
Contributor Author

@powersj let me know if you need something. I would like to use in our environment ASAP and I would like to see merged.

@toni-moreno
Copy link
Contributor Author

toni-moreno commented Dec 1, 2021

@sspaink , @powersj Hello. I would like if this PR could be merged soon , or it needs for changes or other features to be implemented maybe?

@Hipska
Copy link
Contributor

Hipska commented Jun 28, 2022

Hi @toni-moreno, sorry this got lost in the pile. Would you be able to merge in master branch to fix the conflicts?

@Trovalo
Copy link
Collaborator

Trovalo commented Jul 11, 2022

I've tested the PR and got the following error by just passing the config via URL
command:
telegraf_HttpConfRefresh.exe -config 'https://raw.githubusercontent.com/Trovalo/MyStuff/main/tempfile.conf' --test

Error:

2022-07-11T12:08:39Z I! Starting Telegraf
panic: runtime error: invalid memory address or nil pointer dereference
[signal 0xc0000005 code=0x0 addr=0x20 pc=0xcf9e8e]

goroutine 1 [running]:
github.com/influxdata/telegraf/config.fetchConfig(0xc0000721e0?, 0x0)
        C:/Projects/git/telegraf/config/config.go:962 +0x6e
github.com/influxdata/telegraf/config.loadConfig({0xc0000721e0, 0x44}, 0x30?)
        C:/Projects/git/telegraf/config/config.go:949 +0xc6
github.com/influxdata/telegraf/config.(*Config).LoadConfig(0x4b496ec?, {0xc0000721e0?, 0xc000a97a90?}, 0x2030002?)
        C:/Projects/git/telegraf/config/config.go:761 +0x71
main.runAgent({0x5464cc0, 0xc000a38e80}, {0x7a74328, 0x0, 0x0}, {0x7a74328, 0x0, 0x0})
        C:/Projects/git/telegraf/cmd/telegraf/telegraf.go:284 +0x11a5
main.reloadLoop({0x7a74328, 0x0, 0x0}, {0x7a74328, 0x0, 0x0})
        C:/Projects/git/telegraf/cmd/telegraf/telegraf.go:180 +0x4b2
main.run({0x7a74328, 0x0, 0x0}, {0x7a74328, 0x0, 0x0})
        C:/Projects/git/telegraf/cmd/telegraf/telegraf_windows.go:26 +0xcc
main.main()
        C:/Projects/git/telegraf/cmd/telegraf/telegraf.go:523 +0x96b

I'm running Telegraf on WIndows, the standard build works fine.

@Hipska Hipska added feat Improvement on an existing feature such as adding a new setting/mode to an existing plugin area/configuration area/agent waiting for response waiting for response from contributor labels Jul 11, 2022
@telegraf-tiger
Copy link
Contributor

Hello! I am closing this issue due to inactivity. I hope you were able to resolve your problem, if not please try posting this question in our Community Slack or Community Page. Thank you!

@telegraf-tiger telegraf-tiger bot closed this Jul 25, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/agent area/configuration feat Improvement on an existing feature such as adding a new setting/mode to an existing plugin waiting for response waiting for response from contributor
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Telegraf reloads URL-based/remote config on a specified interval
5 participants