Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Configuration Validations must validate all fields before returning an error #4613

Closed
MovieStoreGuy opened this issue Dec 29, 2021 · 19 comments
Labels
enhancement New feature or request good first issue Good for newcomers

Comments

@MovieStoreGuy
Copy link
Contributor

Is your feature request related to a problem? Please describe.
A lot of the configuration validate methods will return early stating a singular problem of the configuration, this can lead to a rather long feedback loop addressing issues within the configuration within complicated configurations.

Describe the solution you'd like
Each validate functions must validate all errors and only return an error after it has checked / validated all required fields.

Describe alternatives you've considered
I believe the alternative is leave it as it is or RFTM ??

Additional context
As someone that struggles with dyslexia, it takes me a few times to make sure what I typed is what I wanted, and having to reread the configuration on each failed attempted is rather pain staking process.

@jpkrohling jpkrohling added enhancement New feature or request good first issue Good for newcomers labels Jan 3, 2022
@jpkrohling
Copy link
Member

Should this be made a tracker, listing all the places where this should be done? There aren't that many components in this repo, but contrib might be affected as well.

@bogdandrutu
Copy link
Member

@MovieStoreGuy too many errors may also be a problem and confusion, I am not sold on the "print" all errors is the right approach. I remember the old gcc which did that and was extremely hard to identify the root cause, since an error may cause a chain of errors.

I would say probably one of the first thing we should add a "--dry run" (flag) or "validate" (command) that can return immediately and be used to validate the config instead of having to run the entire binary.

@MovieStoreGuy
Copy link
Contributor Author

My original thought is that to use multierr and expand each error line by line and I can visually understand the amount of things I had gotten wrong so I can see if I need to update one things, or a lot of things (assuming the config is valid yaml).

Something of the vein of:

pipeline: no exporter defined `signalxf`
pipeline: no exporter defined `jager`
exporter: no exporter registered named `statd`
receiver: statsd: listen address is not set
extension: asap: no valid key provided
extension: asap: no keyserver provided

I mainly noticed this as I was cleaning up the contrib lifecycle tests trying to figure out what is the minimal required config to make it work.

I would say probably one of the first thing we should add a "--dry run" (flag) or "validate" (command) that can return immediately and be used to validate the config instead of having to run the entire binary.

I am happy for this to be the first step to getting there, though I would want to make sure it check each entry within the configuration otherwise, it isn't much better than what we have now.

@MovieStoreGuy
Copy link
Contributor Author

Should this be made a tracker, listing all the places where this should be done? There aren't that many components in this repo, but contrib might be affected as well.

I wanted to iron out what would this would look like within the collector and once we agree on the approach, I can create a follow up issue to update config validate methods to do what is required.

@bogdandrutu
Copy link
Member

bogdandrutu commented Jan 4, 2022

@MovieStoreGuy since I have lots of knowledge in the internals probably for me is easier with just one (and also because of my previous bad experiences with parsers when getting one thing wrong can cause lots of chained errors. That being said I think this decision should be driven by the users, so we should get few +1 from users I am fine to do this way.

When it comes to "validate", I have a better idea (I think), we should have a ./otelcol --dry-run -config=.... to print the resolved config nicely (could use different command/flag like print). This will definitely help when people are using remote configs or the experimental (yet to be ready) config sources.

@Chinwendu20
Copy link
Contributor

Chinwendu20 commented Oct 13, 2022

Hello, please what is the verdict on this issue? @bogdandrutu . Would we be creating a --dry-run flag?
Could you also please help give me a visual representation of what you mean by resolved config. Is it like a "cat" of the config file?

@Chinwendu20
Copy link
Contributor

Hello @jpkrohling could you help answer my concerns about this issue?

@bogdandrutu
Copy link
Member

I would say:

  1. Add the "--dry-run" flag that only validates the config and prints all the errors (maybe all the errors across all components).
  2. Decide if we want to print the config result as well, or add a new command/flag to do that. I personally believe if "--dry-run" finds no errors then it should print the config.

@jpkrohling
Copy link
Member

@bogdandrutu's comment makes sense to me.

@Chinwendu20
Copy link
Contributor

Okay, thank you. I will work on this issue

@Chinwendu20
Copy link
Contributor

Hello. I was investigating this issue and I ran the default collector distribution with a faulty config and I discovered
"all" errors come out as output anyway

image

@bogdandrutu
Copy link
Member

Hello. I was investigating this issue and I ran the default collector distribution with a faulty config and I discovered
"all" errors come out as output anyway

That is good, but we need to add the "--dry-run" flag, that will return if the config is valid instead of running the collector.

@Chinwendu20
Copy link
Contributor

Okay thank you

@Chinwendu20
Copy link
Contributor

Chinwendu20 commented Oct 24, 2022

Thanks a lot for your help. Hello @bogdandrutu . I looked further into this and the above screenshot it looks like it returns all the errors because all the errors were related to the top level confguration. I followed the journey from setupConfigurationComponents to configProvider.Get
If there is a marshal error it quickly returns without checking the config validations

This also applies to individual config validations and individual unmarshaling

I am thinking of creating an error variable that would be imported across these files. In the dry run function, I would print out all the errors if available, what do you think?

@tigrannajaryan
Copy link
Member

To add to the above. It would be great if we could print errors about configuration in a way that makes it easier to understand what needs fixing by using some simple ascii graphics, e.g.:

Configuration errors found:

Error 1
=======

extensions:
  opamp/name:
  ^---^--- unknown extension type "opamp".


Error 2
=======

receivers:
  filelog:
    ^--- filelog receiver must have a child "include" setting.   

Error 3
=======
exporters:
  otlphttp:
    endpoint: file://badurl/
              ^------------^--- invalid URL specified for "endpoint" setting.

The detailed info to print this ascii graphics can be returned in a custom error (we would need to indicate whether the error is in the key or in the value, optionally which part of the key or value is from, including character offsets, etc).

Unfortunately we can't reference original line and column numbers since they are unknown after merging and unmarshaling, but we can reference the hierarchical keys which can still be very helpful.

@Chinwendu20
Copy link
Contributor

Chinwendu20 commented Oct 29, 2022

Oh wow I was actually working on this issue as indicated above, spent a substantial amount of time on it, and even implemented @tigrannajaryan suggestion, currently writing test and about to push a PR

Chinwendu20 added a commit to Chinwendu20/opentelemetry-collector that referenced this issue Oct 30, 2022
@Chinwendu20
Copy link
Contributor

Hello, all please I am in the process of writing a test for my PR but am getting this error from the existing test, any suggestions please? I have reverted to the previous state the collector test was before my commits. Yet I still see this:
` collector_test.go:585:
Error Trace: C:\Users\Chinwendum\devenvironment\collector-core-add-dry-run-flag-validate-all-fields\service\collector_test.go:585
C:\Users\Chinwendum\devenvironment\collector-core-add-dry-run-flag-validate-all-fields\service\collector_te
st.go:392
C:\Users\Chinwendum\devenvironment\collector-core-add-dry-run-flag-validate-all-fields\service\collector_te
st.go:405
Error: Should NOT be empty, but was
Test: TestCollectorStartWithOpenCensusMetrics/override_service.version
Messages: mandatory label "service_instance_id" not present
collector_test.go:585:
Error Trace: C:\Users\Chinwendum\devenvironment\collector-core-add-dry-run-flag-validate-all-fields\service\collector_test.go:585
C:\Users\Chinwendum\devenvironment\collector-core-add-dry-run-flag-validate-all-fields\service\collector_te
st.go:392
C:\Users\Chinwendum\devenvironment\collector-core-add-dry-run-flag-validate-all-fields\service\collector_te
st.go:405
Error: Should NOT be empty, but was
Test: TestCollectorStartWithOpenCensusMetrics/override_service.version
Messages: mandatory label "service_instance_id" not present
collector_test.go:585:
Error Trace: C:\Users\Chinwendum\devenvironment\collector-core-add-dry-run-flag-validate-all-fields\service\collector_test.go:585
C:\Users\Chinwendum\devenvironment\collector-core-add-dry-run-flag-validate-all-fields\service\collector_te
st.go:392
C:\Users\Chinwendum\devenvironment\collector-core-add-dry-run-flag-validate-all-fields\service\collector_te
st.go:405
Error: Should NOT be empty, but was
Test: TestCollectorStartWithOpenCensusMetrics/override_service.version
Messages: mandatory label "service_instance_id" not present
collector_test.go:585:
Error Trace: C:\Users\Chinwendum\devenvironment\collector-core-add-dry-run-flag-validate-all-fields\service\collector_test.go:585
C:\Users\Chinwendum\devenvironment\collector-core-add-dry-run-flag-validate-all-fields\service\collector_te
st.go:392
C:\Users\Chinwendum\devenvironment\collector-core-add-dry-run-flag-validate-all-fields\service\collector_te
st.go:405
Error: Should NOT be empty, but was
Test: TestCollectorStartWithOpenCensusMetrics/override_service.version
Messages: mandatory label "service_instance_id" not present
collector_test.go:585:
Error Trace: C:\Users\Chinwendum\devenvironment\collector-core-add-dry-run-flag-validate-all-fields\service\collector_test.go:585
C:\Users\Chinwendum\devenvironment\collector-core-add-dry-run-flag-validate-all-fields\service\collector_te
st.go:392
C:\Users\Chinwendum\devenvironment\collector-core-add-dry-run-flag-validate-all-fields\service\collector_te
st.go:405
Error: Should NOT be empty, but was
Test: TestCollectorStartWithOpenCensusMetrics/override_service.version
Messages: mandatory label "service_instance_id" not present
collector_test.go:585:
Error Trace: C:\Users\Chinwendum\devenvironment\collector-core-add-dry-run-flag-validate-all-fields\service\collector_test.go:585
C:\Users\Chinwendum\devenvironment\collector-core-add-dry-run-flag-validate-all-fields\service\collector_te
st.go:392
C:\Users\Chinwendum\devenvironment\collector-core-add-dry-run-flag-validate-all-fields\service\collector_te
st.go:405
Error: Should NOT be empty, but was
Test: TestCollectorStartWithOpenCensusMetrics/override_service.version
Messages: mandatory label "service_instance_id" not present

Message otel-collector

`

codeboten pushed a commit to codeboten/opentelemetry-collector that referenced this issue Jun 6, 2023
codeboten pushed a commit to codeboten/opentelemetry-collector that referenced this issue Jun 7, 2023
codeboten pushed a commit that referenced this issue Jun 7, 2023
This is bringing back the functionality initially submitted in @Chinwendu20's PR: #6445

Adds a validate method which validates the configuration

Link to tracking Issue: #4613

---------

Signed-off-by: Alex Boten <[email protected]>
Co-authored-by: Maureen <[email protected]>
@drewdaemon
Copy link

Was this issue resolved in #7835?

@atoulme
Copy link
Contributor

atoulme commented Dec 19, 2023

Yes, closing.

@atoulme atoulme closed this as completed Dec 19, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request good first issue Good for newcomers
Projects
None yet
Development

No branches or pull requests

7 participants