Collecting stats #2369
Comments
I think you could add
[traefik-anonymous-stats]
collect: true
store: /path/to/store (or any other storage)
auto-share: true
keys-to-share:
- volume
- network
- settings |
I'm not sure I understand the difference 🤔 The proposal already includes an option to enable or disable collection. What do you mean exactly?
I really think we should stay as simple as possible and this is a bit over-engineered IMHO :)
Same as previous item.
Again, already in the proposal: Detail which data is sent in the documentation
In the proposal, the collected stats are not linked to any bug-reporting mechanism. We just want to send some stats at a fixed interval (every day?). |
As long as the collection is opt-in via config / flags - we'd be more than happy to enable it on some of our servers that use Traefik. |
Here are some more details:
|
Thanks for the writeup @emilevauge, I appreciate the effort to communicate this as clearly as possible. May I suggest that giving users a way to view the exported data (dump to disk, HTTP endpoint, or otherwise) would make this less "scary"? Other than that, I think the proposal is good, and in my opinion making it opt-in might gain you next to nothing. As long as it's clearly documented and communicated (your mention of logging the stat-sending action is great), I think opt-out is very reasonable. |
+1 to what @alkar said, I would feel much happier if I could obtain the exact copy of the export. |
@alkar I totally agree on this. The proposal suggests we could log all the data sent at each collection. Does this match your need?
|
@emilevauge I sort of missed that point! That sounds good to me, yes, my only concern is really whether it would make logs too noisy in large setups? If that log entry is going to be kilobytes long then maybe it shouldn't include the data (perhaps some users don't even want these statistics to end up in their log aggregation systems - although them being anonymous I don't see that being an issue). It depends on how much data you expect to collect and the collection frequency, I suppose. Personally, I'm not too fussed about the means of "inspection" as long as there's the option. HTH |
I agree, a log that is too verbose can be an issue, but an option to log stats to a separate file would be fine |
@alkar the logged data will not be large. It's only the static configuration. And BTW, we already log the configuration in JSON when Traefik starts. |
Sounds great then! |
I'm ok with opt-out. Folks who care are probably also the ones who are most willing/able to figure out how to set the opt-out parm. You might start as opt-in with a lot of documentation about how it's going to be opt-out in a future release. |
For even more transparency, I suggest we add some information on the webui when collection is enabled. Proposal updated. |
Happy to help with some real life data, however:
|
I totally agree; this is why I wrote in the first place that it is important to be able to just collect data, and then to have at least two options: auto-send (or not), and dump the data to a file for manual inspection and sending. Just check my first comment on this topic; you'll see that I've covered the main things that matter in an enterprise environment. |
opt-out with clear mention + easy and documented way to turn it off. |
Hello there, a PR has just been opened for this: #2447 :) |
Closed by #2447 |
Following #2172, this is a proposal to discuss with the community on how we can get more information about deployed Traefik instances.
Why does the development team need more info?
As you may know, the Traefik core development team is quite small and as with a lot of open source projects, we lack time and resources. As a consequence, we have to carefully choose which tasks and features need our attention. As a result, we usually invest our time on features needed or requested by most of the community. In order to efficiently do this, we have to know how our community uses Traefik.
So far we have been using feedback from our users on Slack and GitHub, but we definitely need more details on usage.
To give an illustrative example, we have no way to know which configuration backend is the most used or which is used by the fewest people. What if we discover that we maintain a configuration backend that is largely unused? Knowing this, we could have allocated our resources to something more useful, especially since we have a lot of useful things we can work on ;)
Another example is that we have no idea of release adoption/implementation. Having this knowledge would help us to adapt our development cycle to benefit adoption. We don't need or want to release every month if users are waiting for 2 months before updating to the latest release.
We just need to know what is used, and what is not.
What we propose
Ideally, we would like statistics on the toml/flags Traefik configuration and Traefik versions our users are using. The toml/flags configuration would allow the development team to know what is used in Traefik and what is not.
Only export what's needed.
We already use a mechanism to export the configuration when using the traefik bug command. It exports only what is required for bug diagnostics, that is, only specifically tagged configuration fields. Furthermore, private data (IPs, email addresses, etc.) is not exported because those fields are not tagged in the code (with struct tags). We could reuse this in the stats collection mechanism.
What's great about this solution is that the exported configuration fields are hard-coded. Each time a new field is added to the configuration, it will not be exported by default. We will need to tag it in the code to export it. This allows us to carefully review what is and is not exported in future configuration changes, and the community can review this before implementation.
Collected configuration fields are hard-coded.
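To illustrate the tag-based export described above, here is a minimal sketch in Go. It is not Traefik's actual code: the Config and Backend types, the exportable helper, and the `export:"true"` tag name are all assumptions made for this example. It only shows how reflection can copy tagged fields and silently skip everything else.

```go
package main

import (
	"encoding/json"
	"fmt"
	"reflect"
)

// Config is a hypothetical static configuration, not Traefik's real one.
type Config struct {
	LogLevel    string  `export:"true"`
	DefaultPort int     `export:"true"`
	AdminEmail  string  // untagged: never exported
	Backend     Backend `export:"true"`
}

// Backend is a hypothetical nested section of the configuration.
type Backend struct {
	Kind     string `export:"true"`
	Endpoint string // untagged: may contain a private address
}

// exportable returns a map holding only the fields tagged `export:"true"`.
// Nested structs are filtered recursively with the same rule.
// It expects a struct value (not a pointer) whose fields are all exported.
func exportable(v interface{}) map[string]interface{} {
	out := map[string]interface{}{}
	rv := reflect.ValueOf(v)
	rt := rv.Type()
	for i := 0; i < rt.NumField(); i++ {
		field := rt.Field(i)
		if field.Tag.Get("export") != "true" {
			continue // new or private fields are skipped by default
		}
		fv := rv.Field(i)
		if fv.Kind() == reflect.Struct {
			out[field.Name] = exportable(fv.Interface())
		} else {
			out[field.Name] = fv.Interface()
		}
	}
	return out
}

func main() {
	cfg := Config{
		LogLevel:    "INFO",
		DefaultPort: 80,
		AdminEmail:  "admin@example.com",
		Backend:     Backend{Kind: "docker", Endpoint: "10.0.0.1:2375"},
	}
	payload, err := json.Marshal(exportable(cfg))
	if err != nil {
		panic(err)
	}
	fmt.Println(string(payload))
}
```

In this sketch, AdminEmail and Backend.Endpoint never reach the payload because they carry no tag. Adding a field to Config without a tag changes nothing in the output, which is the "not exported by default" property the proposal relies on.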
Opt-in vs. Opt-out
Another topic we need to discuss is whether to make it opt-in or opt-out.
The easiest way would be to set it opt-in: if you want to export your config, you need to enable it in your configuration.
The major downside of this is that we doubt users will enable data collection by themselves. This could make the feature useless for the development team, as the whole point is to get a good idea of how Traefik is used. We need a certain amount of feedback to get relevant data. Further, we think that only advanced/active users in the community would enable this option, so the collected data would be biased.
Our ideal goal would be to make it opt-out. But we don't want to scare our community with this :'(. This is the best solution for the development team, but it is only going to be possible if users are confident in the collection mechanism and if things are done transparently.
Transparency & Trust
We want to be as transparent as possible on this. Here are a few principles we aim to follow:
Stats collection is enabled. Help us improve Traefik by leaving this feature on :) More details on https://docs.traefik.io/basics/#stats-collect
Stats sent on https://collect.traefik.io: {DATA}
How can you help?
The best thing you can do is voice your opinion about this :) We need your feedback, your ideas, and your constructive criticism. Help us build a mechanism that gives the development team a better idea of how Traefik is used, so we can focus on what matters while still working for you and your businesses.