Regression: crash when TLS insecure is set on authenticators #6619
Comments
Pinging code owners:
See Adding Labels via Comments if you do not have permissions to add labels yourself.
@Depechie, can you please enable debug logging and share the agent collector logs? I quickly ran 0.64.1 on my machine with a custom OAuth server and it works fine.
Hey @pavankrish123,
but is that enough? The logs don't seem to be any different.
@pavankrish123 If it would help, I can also pass all the oauth settings data... it is just a test app registration in Azure. Just let me know if that is needed or not.
Thanks @Depechie. I was looking for these gRPC logs:
2022-11-17T21:51:55.029Z info zapgrpc/zapgrpc.go:174 [core] [Channel #1] Channel created {"grpc_log": true}
2022-11-17T21:51:55.029Z info zapgrpc/zapgrpc.go:174 [core] [Channel #1] Channel Connectivity change to SHUTDOWN {"grpc_log": true}
2022-11-17T21:51:55.029Z info zapgrpc/zapgrpc.go:174 [core] [Channel #1] Channel deleted {"grpc_log": true}
2022-11-17T21:51:55.029Z info service/service.go:115 Starting shutdown...
Is it possible to enable gRPC debug logging by setting these environment variables: `GRPC_GO_LOG_VERBOSITY_LEVEL=99 GRPC_GO_LOG_SEVERITY_LEVEL=info`? Also, a quick question: is there any reason we have inferred that the oauth2client extension is causing the crash? The logs indicate that shutdown of the exporter is triggering the issue - that said, something is causing the shutdown.
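For reference, a minimal compose-level sketch of setting those variables for the agent container (the service name, image tag, and file paths here are assumptions, not taken from this issue):

```yaml
# Hypothetical docker-compose snippet: enable verbose gRPC logging for the agent container.
services:
  otel-agent:
    image: otel/opentelemetry-collector-contrib:0.64.1   # assumed image/tag
    command: ["--config=/etc/otel-agent.yml"]
    volumes:
      - ./otel-agent.yml:/etc/otel-agent.yml:ro          # assumed config path
    environment:
      - GRPC_GO_LOG_VERBOSITY_LEVEL=99
      - GRPC_GO_LOG_SEVERITY_LEVEL=info
```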
Well @pavankrish123, if I create the 2 services (agent & server) in the docker compose file but just leave out the oauth2client auth part, everything just works. I will try adding the environment variables; this is done at the docker container level, I suppose?
Logs:
Thanks @Depechie. I think I can reproduce the error and need to get to the bottom of it; I am guessing recent changes to the core collector's service lifecycle management are causing this issue. Just FYI, I was able to reproduce this directly on my local machine with ./otelcol-contrib --config otel-agent.yml and the following config:
extensions:
oauth2client:
client_id: agent
client_secret: ******
token_url: http://localhost:8080/auth/realms/opentelemetry/protocol/openid-connect/token
receivers:
otlp:
protocols:
grpc:
exporters:
otlp/auth:
endpoint: myserver:5000
tls:
insecure: true
auth:
authenticator: oauth2client
service:
telemetry:
logs:
level: "debug"
extensions:
- oauth2client
pipelines:
traces:
receivers:
- otlp
exporters:
- otlp/auth

Running this config caused the entire process to crash:

./otelcol-contrib --config otel-agent.yml
2022/11/18 01:51:18 proto: duplicate proto type registered: jaeger.api_v2.PostSpansRequest
2022/11/18 01:51:18 proto: duplicate proto type registered: jaeger.api_v2.PostSpansResponse
2022-11-18T01:51:19.294-0800 info service/telemetry.go:110 Setting up own telemetry...
2022-11-18T01:51:19.296-0800 info service/telemetry.go:140 Serving Prometheus metrics {"address": ":8888", "level": "basic"}
2022-11-18T01:51:19.297-0800 debug components/components.go:28 Stable component. {"kind": "exporter", "data_type": "traces", "name": "otlp/auth", "stability": "stable"}
2022-11-18T01:51:19.298-0800 debug components/components.go:28 Stable component. {"kind": "receiver", "name": "otlp", "pipeline": "traces", "stability": "stable"}
2022-11-18T01:51:19.299-0800 info service/service.go:89 Starting otelcol-contrib... {"Version": "0.64.1", "NumCPU": 12}
2022-11-18T01:51:19.299-0800 info extensions/extensions.go:41 Starting extensions...
2022-11-18T01:51:19.299-0800 info extensions/extensions.go:44 Extension is starting... {"kind": "extension", "name": "oauth2client"}
2022-11-18T01:51:19.299-0800 info extensions/extensions.go:48 Extension started. {"kind": "extension", "name": "oauth2client"}
2022-11-18T01:51:19.299-0800 info pipelines/pipelines.go:74 Starting exporters...
2022-11-18T01:51:19.299-0800 info pipelines/pipelines.go:78 Exporter is starting... {"kind": "exporter", "data_type": "traces", "name": "otlp/auth"}
2022-11-18T01:51:19.299-0800 info zapgrpc/zapgrpc.go:174 [core] [Channel #1] Channel created {"grpc_log": true}
2022-11-18T01:51:19.300-0800 info zapgrpc/zapgrpc.go:174 [core] [Channel #1] Channel Connectivity change to SHUTDOWN {"grpc_log": true}
2022-11-18T01:51:19.300-0800 info zapgrpc/zapgrpc.go:174 [core] [Channel #1] Channel deleted {"grpc_log": true}
2022-11-18T01:51:19.300-0800 info service/service.go:115 Starting shutdown...
2022-11-18T01:51:19.300-0800 info pipelines/pipelines.go:118 Stopping receivers...
2022-11-18T01:51:19.300-0800 info pipelines/pipelines.go:125 Stopping processors...
2022-11-18T01:51:19.300-0800 info pipelines/pipelines.go:132 Stopping exporters...
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x10 pc=0x15bc0ab]
goroutine 1 [running]:
google.golang.org/grpc.(*ClientConn).Close(0x0)
google.golang.org/[email protected]/clientconn.go:1016 +0x4b
go.opentelemetry.io/collector/exporter/otlpexporter.(*exporter).shutdown(0xc000e487d0?, {0x9?, 0x962cd01?})
go.opentelemetry.io/collector/exporter/[email protected]/otlp.go:93 +0x1d
go.opentelemetry.io/collector/component.ShutdownFunc.Shutdown(...)
go.opentelemetry.io/[email protected]/component/component.go:91
go.opentelemetry.io/collector/exporter/exporterhelper.newBaseExporter.func2({0x7e544c0, 0xc00019e000})
go.opentelemetry.io/[email protected]/exporter/exporterhelper/common.go:177 +0x5a
go.opentelemetry.io/collector/component.ShutdownFunc.Shutdown(...)
go.opentelemetry.io/[email protected]/component/component.go:91
go.opentelemetry.io/collector/service/internal/pipelines.(*Pipelines).ShutdownAll(0xc0000c18b0, {0x7e544c0, 0xc00019e000})
go.opentelemetry.io/[email protected]/service/internal/pipelines/pipelines.go:135 +0x36b
go.opentelemetry.io/collector/service.(*service).Shutdown(0xc000633800, {0x7e544c0, 0xc00019e000})
go.opentelemetry.io/[email protected]/service/service.go:121 +0xd4
go.opentelemetry.io/collector/service.(*Collector).shutdownServiceAndTelemetry(0xc0015fba88, {0x7e544c0?, 0xc00019e000?})
go.opentelemetry.io/[email protected]/service/collector.go:234 +0x36
go.opentelemetry.io/collector/service.(*Collector).setupConfigurationComponents(0xc0015fba88, {0x7e544c0, 0xc00019e000})
go.opentelemetry.io/[email protected]/service/collector.go:155 +0x286
go.opentelemetry.io/collector/service.(*Collector).Run(0xc0015fba88, {0x7e544c0, 0xc00019e000})
go.opentelemetry.io/[email protected]/service/collector.go:164 +0x46
go.opentelemetry.io/collector/service.NewCommand.func1(0xc00063d200, {0x71e628b?, 0x2?, 0x2?})
go.opentelemetry.io/[email protected]/service/command.go:53 +0x479
github.com/spf13/cobra.(*Command).execute(0xc00063d200, {0xc00019a190, 0x2, 0x2})
github.com/spf13/[email protected]/command.go:916 +0x862
github.com/spf13/cobra.(*Command).ExecuteC(0xc00063d200)
github.com/spf13/[email protected]/command.go:1044 +0x3bd
github.com/spf13/cobra.(*Command).Execute(...)
github.com/spf13/[email protected]/command.go:968
main.runInteractive({{0xc00107a4e0, 0xc00107b6b0, 0xc00107a900, 0xc0004199e0}, {{0x720c2dc, 0xf}, {0x7283ffd, 0x1f}, {0x71e023a, 0x6}}, ...})
github.com/open-telemetry/opentelemetry-collector-releases/contrib/main.go:32 +0x5d
main.run(...)
github.com/open-telemetry/opentelemetry-collector-releases/contrib/main_others.go:11
main.main()
github.com/open-telemetry/opentelemetry-collector-releases/contrib/main.go:25 +0x1d8

cc: @jpkrohling
As a matter of fact, any client auth extension is causing the crash :( Example 1: enabling the basicauth extension also causes the same crash with this config:

extensions:
# oauth2client:
# client_id: agent
# client_secret: AkW04qQhMavsbEzGmNofSsr576Ye3IVg
# token_url: http://localhost:8080/auth/realms/opentelemetry/protocol/openid-connect/token
basicauth/client:
client_auth:
username: username
password: password
receivers:
otlp:
protocols:
grpc:
exporters:
otlp/auth:
endpoint: otel-server:4317
tls:
insecure: true
auth:
authenticator: basicauth/client
service:
telemetry:
logs:
level: "debug"
extensions:
- basicauth/client
pipelines:
traces:
receivers:
- otlp
exporters:
- otlp/auth

$ ./otelcol-contrib --config otel-agent.yml
2022/11/18 02:34:13 proto: duplicate proto type registered: jaeger.api_v2.PostSpansRequest
2022/11/18 02:34:13 proto: duplicate proto type registered: jaeger.api_v2.PostSpansResponse
2022-11-18T02:34:13.783-0800 info service/telemetry.go:110 Setting up own telemetry...
2022-11-18T02:34:13.783-0800 info service/telemetry.go:140 Serving Prometheus metrics {"address": ":8888", "level": "basic"}
2022-11-18T02:34:13.784-0800 debug components/components.go:28 Stable component. {"kind": "exporter", "data_type": "traces", "name": "otlp/auth", "stability": "stable"}
2022-11-18T02:34:13.784-0800 debug components/components.go:28 Stable component. {"kind": "receiver", "name": "otlp", "pipeline": "traces", "stability": "stable"}
2022-11-18T02:34:13.784-0800 info service/service.go:89 Starting otelcol-contrib... {"Version": "0.64.1", "NumCPU": 12}
2022-11-18T02:34:13.784-0800 info extensions/extensions.go:41 Starting extensions...
2022-11-18T02:34:13.784-0800 info extensions/extensions.go:44 Extension is starting... {"kind": "extension", "name": "basicauth/client"}
2022-11-18T02:34:13.784-0800 info extensions/extensions.go:48 Extension started. {"kind": "extension", "name": "basicauth/client"}
2022-11-18T02:34:13.784-0800 info pipelines/pipelines.go:74 Starting exporters...
2022-11-18T02:34:13.784-0800 info pipelines/pipelines.go:78 Exporter is starting... {"kind": "exporter", "data_type": "traces", "name": "otlp/auth"}
2022-11-18T02:34:13.784-0800 info zapgrpc/zapgrpc.go:174 [core] [Channel #1] Channel created {"grpc_log": true}
2022-11-18T02:34:13.784-0800 info zapgrpc/zapgrpc.go:174 [core] [Channel #1] Channel Connectivity change to SHUTDOWN {"grpc_log": true}
2022-11-18T02:34:13.784-0800 info zapgrpc/zapgrpc.go:174 [core] [Channel #1] Channel deleted {"grpc_log": true}
2022-11-18T02:34:13.784-0800 info service/service.go:115 Starting shutdown...
2022-11-18T02:34:13.784-0800 info pipelines/pipelines.go:118 Stopping receivers...
2022-11-18T02:34:13.784-0800 info pipelines/pipelines.go:125 Stopping processors...
2022-11-18T02:34:13.784-0800 info pipelines/pipelines.go:132 Stopping exporters...
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x10 pc=0x15bc0ab]
goroutine 1 [running]:
google.golang.org/grpc.(*ClientConn).Close(0x0)
google.golang.org/[email protected]/clientconn.go:1016 +0x4b
go.opentelemetry.io/collector/exporter/otlpexporter.(*exporter).shutdown(0xc0010d0ad0?, {0x9?, 0x962cd01?})
go.opentelemetry.io/collector/exporter/[email protected]/otlp.go:93 +0x1d
go.opentelemetry.io/collector/component.ShutdownFunc.Shutdown(...)
go.opentelemetry.io/[email protected]/component/component.go:91
go.opentelemetry.io/collector/exporter/exporterhelper.newBaseExporter.func2({0x7e544c0, 0xc00019e000})
go.opentelemetry.io/[email protected]/exporter/exporterhelper/common.go:177 +0x5a
go.opentelemetry.io/collector/component.ShutdownFunc.Shutdown(...)
go.opentelemetry.io/[email protected]/component/component.go:91
go.opentelemetry.io/collector/service/internal/pipelines.(*Pipelines).ShutdownAll(0xc0011f0190, {0x7e544c0, 0xc00019e000})
go.opentelemetry.io/[email protected]/service/internal/pipelines/pipelines.go:135 +0x36b
go.opentelemetry.io/collector/service.(*service).Shutdown(0xc000cad300, {0x7e544c0, 0xc00019e000})
go.opentelemetry.io/[email protected]/service/service.go:121 +0xd4
go.opentelemetry.io/collector/service.(*Collector).shutdownServiceAndTelemetry(0xc0012bfa88, {0x7e544c0?, 0xc00019e000?})
go.opentelemetry.io/[email protected]/service/collector.go:234 +0x36
go.opentelemetry.io/collector/service.(*Collector).setupConfigurationComponents(0xc0012bfa88, {0x7e544c0, 0xc00019e000})
go.opentelemetry.io/[email protected]/service/collector.go:155 +0x286
go.opentelemetry.io/collector/service.(*Collector).Run(0xc0012bfa88, {0x7e544c0, 0xc00019e000})
go.opentelemetry.io/[email protected]/service/collector.go:164 +0x46
go.opentelemetry.io/collector/service.NewCommand.func1(0xc0001d6c00, {0x71e628b?, 0x2?, 0x2?})
go.opentelemetry.io/[email protected]/service/command.go:53 +0x479
github.com/spf13/cobra.(*Command).execute(0xc0001d6c00, {0xc00019a190, 0x2, 0x2})
github.com/spf13/[email protected]/command.go:916 +0x862
github.com/spf13/cobra.(*Command).ExecuteC(0xc0001d6c00)
github.com/spf13/[email protected]/command.go:1044 +0x3bd
github.com/spf13/cobra.(*Command).Execute(...)
github.com/spf13/[email protected]/command.go:968
main.runInteractive({{0xc00085c750, 0xc00085d920, 0xc00085cb70, 0xc00085c3f0}, {{0x720c2dc, 0xf}, {0x7283ffd, 0x1f}, {0x71e023a, 0x6}}, ...})
github.com/open-telemetry/opentelemetry-collector-releases/contrib/main.go:32 +0x5d
main.run(...)
github.com/open-telemetry/opentelemetry-collector-releases/contrib/main_others.go:11
main.main()
github.com/open-telemetry/opentelemetry-collector-releases/contrib/main.go:25 +0x1d8

@Depechie can you please validate this on your end and adjust the bug description accordingly.
@pavankrish123 confirmed
@Depechie, an update. If you remove the exporter's tls.insecure setting, the crash goes away. So here is what is happening when setting tls.insecure: true together with a client authenticator. Logs from the 0.62.0 version:

otel-agent | 2022-11-18T17:00:12.673Z info service/service.go:112 Starting otelcol-contrib... {"Version": "0.60.0", "NumCPU": 8}
otel-agent | 2022-11-18T17:00:12.673Z info extensions/extensions.go:42 Starting extensions...
otel-agent | 2022-11-18T17:00:12.673Z info extensions/extensions.go:45 Extension is starting... {"kind": "extension", "name": "oauth2client"}
otel-agent | 2022-11-18T17:00:12.673Z info extensions/extensions.go:49 Extension started. {"kind": "extension", "name": "oauth2client"}
otel-agent | 2022-11-18T17:00:12.673Z info pipelines/pipelines.go:74 Starting exporters...
otel-agent | 2022-11-18T17:00:12.673Z info pipelines/pipelines.go:78 Exporter is starting... {"kind": "exporter", "data_type": "traces", "name": "otlp/auth"}
otel-agent | 2022-11-18T17:00:12.673Z info zapgrpc/zapgrpc.go:174 [core] [Channel #1] Channel created {"grpc_log": true}
otel-agent | 2022-11-18T17:00:12.673Z info zapgrpc/zapgrpc.go:174 [core] [Channel #1] Channel Connectivity change to SHUTDOWN {"grpc_log": true}
otel-agent | 2022-11-18T17:00:12.673Z info zapgrpc/zapgrpc.go:174 [core] [Channel #1] Channel deleted {"grpc_log": true}
otel-agent | Error: cannot start pipelines: grpc: the credentials require transport level security (use grpc.WithTransportCredentials() to set)
otel-agent | 2022/11/18 17:00:12 collector server run finished with error: cannot start pipelines: grpc: the credentials require transport level security (use grpc.WithTransportCredentials() to set)

For some reason, instead of failing gracefully as usual because the exporter must not be insecure, this version of the collector is now crashing. Please remove the exporter's insecure setting and try it out.
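For illustration only (this is not the reporter's actual setup; the CA path is a placeholder), the exporter section could trust the server certificate instead of using insecure:

```yaml
exporters:
  otlp/auth:
    endpoint: myserver:5000
    tls:
      # instead of insecure: true, trust the server certificate
      ca_file: /etc/certs/ca.pem   # placeholder path
    auth:
      authenticator: oauth2client
```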
Well... I would love to try that, but on a local docker container install the stack does not work without insecure.
@Depechie, please refer to this blog written by our friend @jpkrohling on how to create some dummy certs. You see, the collector refuses to connect over a plain-text channel.
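As a rough illustration of the compose side (service names, image tag, and paths here are assumptions, not taken from this thread), the generated certs could be mounted into both containers and then referenced from each collector config:

```yaml
# Hypothetical compose snippet: share locally generated dummy certs with both collectors.
services:
  otel-agent:
    image: otel/opentelemetry-collector-contrib:0.64.1   # assumed image/tag
    command: ["--config=/etc/otel-agent.yml"]
    volumes:
      - ./certs:/etc/certs:ro
      - ./otel-agent.yml:/etc/otel-agent.yml:ro
  otel-server:
    image: otel/opentelemetry-collector-contrib:0.64.1
    command: ["--config=/etc/otel-server.yml"]
    volumes:
      - ./certs:/etc/certs:ro
      - ./otel-server.yml:/etc/otel-server.yml:ro
```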
Ok will try that next week :) thx for the details!!
Any guidance on how to get this done with docker compose?
And sometimes
I tried:
@pavankrish123, I don't think we should be crashing the collector when the "tls.insecure" option is set, even if the transport requires confidentiality. Given you debugged this already, would you be OK with opening a PR fixing that? I can then review the PR.
@jpkrohling and @pavankrish123, is there any possibility you could help out in the meantime with that local sample I'm trying to work out? If not, no worries...
Thanks @jpkrohling - will issue a PR soon. It looks like a regression; the fix is probably in the core. Also, can we please remove oauth2extension from the labels?
Will try a few things on my end and get back to you soon, @Depechie. Been busy the last couple of days.
I'm not familiar enough with Docker Compose to be helpful. If you can reproduce the problem using your local machine instead, I can probably help you.
I moved this to the collector core repository. I don't think this warrants a patch release, as this is triggered only by invalid configuration options.
The issue is now fixed. We can close this.
Component(s)
extension/oauth2clientauth
What happened?
Description
Trying to have 2 OpenTelemetry Collectors talk to each other using the oauth2client extension (agent <> server).
Steps to Reproduce
Docker compose file
With 2 collector configs defined
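The actual configs were not attached here; as a purely hypothetical sketch of what the server side of such a pair might look like (the issuer URL, audience, and exporter are placeholders, not the reporter's values), the receiving collector would authenticate incoming requests with an extension such as oidc:

```yaml
# Hypothetical server-side collector config (placeholder values only).
extensions:
  oidc:
    issuer_url: http://localhost:8080/auth/realms/opentelemetry
    audience: collector
receivers:
  otlp:
    protocols:
      grpc:
        auth:
          authenticator: oidc
exporters:
  logging:
service:
  extensions:
    - oidc
  pipelines:
    traces:
      receivers:
        - otlp
      exporters:
        - logging
```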
Expected Result
Correct data transfer
Actual Result
The OpenTelemetry Collector configured as the agent keeps crashing.
Collector version
v0.64.1
Environment information
Environment
OS: Windows with WSL2 Ubuntu
OpenTelemetry Collector configuration
Log output
Additional context
No response