Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Internal Kafka clients should do TLS hostname verification except Nodeports #7526

Conversation

see-quick
Copy link
Member

@see-quick see-quick commented Oct 26, 2022

Signed-off-by: see-quick [email protected]

Type of change

  • Bugfix

Description

Currently, our internal Kafka clients do not perform TLS hostname verification on all possible Kubernetes listeners. This PR solves #7518 and enables TLS hostname verification and also disables it in situations where we do not want to execute it (e.g., when we use Nodeport listener, because it does not support TLS hostname verification on ports).

Checklist

  • Make sure all tests pass

@see-quick see-quick added this to the 0.32.0 milestone Oct 26, 2022
@see-quick see-quick self-assigned this Oct 26, 2022
@see-quick
Copy link
Member Author

/azp run regression

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

Comment on lines 219 to 221
// we disable ssl.endpoint.identification.algorithm for external listener (i.e., Nodeport),
// because TLS hostname verification is not supported on such listener type.
.withAdditionalConfig("ssl.endpoint.identification.algorithm=\n")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here ... Why does this use Node port listener here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The producer fails on this error:

eadyOps=0] SSL Handshake failed
javax.net.ssl.SSLHandshakeException: No subject alternative names matching IP address 192.168.0.226 found
	at sun.security.ssl.Alert.createSSLException(Alert.java:131) ~[?:?]
	at sun.security.ssl.TransportContext.fatal(TransportContext.java:353) ~[?:?]
	at sun.security.ssl.TransportContext.fatal(TransportContext.java:296) ~[?:?]
	at sun.security.ssl.TransportContext.fatal(TransportContext.java:291) ~[?:?]
	at sun.security.ssl.CertificateMessage$T13CertificateConsumer.checkServerCerts(CertificateMessage.java:1357) ~[?:?]
	at sun.security.ssl.CertificateMessage$T13CertificateConsumer.onConsumeCertificate(CertificateMessage.java:1232) ~[?:?]
	at sun.security.ssl.CertificateMessage$T13CertificateConsumer.consume(CertificateMessage.java:1175) ~[?:?]
	at sun.security.ssl.SSLHandshake.consume(SSLHandshake.java:392) ~[?:?]
	at sun.security.ssl.HandshakeContext.dispatch(HandshakeContext.java:443) ~[?:?]
	at sun.security.ssl.SSLEngineImpl$DelegatedTask$DelegatedAction.run(SSLEngineImpl.java:1074) ~[?:?]
	at sun.security.ssl.SSLEngineImpl$DelegatedTask$DelegatedAction.run(SSLEngineImpl.java:1061) ~[?:?]
	at java.security.AccessController.doPrivileged(Native Method) ~[?:?]
	at sun.security.ssl.SSLEngineImpl$DelegatedTask.run(SSLEngineImpl.java:1008) ~[?:?]

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not wondering about the disabling of it. I'm wondering why does this use node-ports in the first place as it seems unnecessary.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure...you mean we should rather produce messages to the internal listener instead of producing them via external NodePort?

Because these test cases currently do the following:

  1. Create a bridge consumer to receive messages from the external listener (NodePort)
  2. Create a producer to produce messages via the same listener (NodePort)

So should we modify the second point to use an internal listener instead of an external one?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I think we should think about what the purpose of the test is. i mean I can approve this and we can merge it. But we should look into that in some separate PR - what exactly does this bring us.

Copy link
Member

@im-konge im-konge Oct 26, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIRC reason why we are using external producer etc. was because at the time I was writing those "weird username tests", there were no possibilities to use internal clients - as they didn't allow any username to be longer than some number of characters and it shouldn't contain any special characters.

External clients are allowing it.

TBH I don't remember the exact reason why we went this way - maybe there was a different reason, but I guess we can rewrite those test to use our test-clients.

But the rewriting itself should be for a different PR.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I think we should think about what the purpose of the test is.

I agree, we definitely could do an investigation into such test cases and find out if they are necessary. :) I can create an issue for that.

@see-quick
Copy link
Member Author

see-quick commented Oct 27, 2022

btw it seems that a few test cases are failing in ListenersST due to the TLS hostname verification...with following errror:

Caused by: javax.net.ssl.SSLHandshakeException: No subject alternative DNS name matching my-cluster-82bc9931-kafka-bootstrap found.
	at sun.security.ssl.Alert.createSSLException(Alert.java:131) ~[?:?]
	at sun.security.ssl.TransportContext.fatal(TransportContext.java:353) ~[?:?]
	at sun.security.ssl.TransportContext.fatal(TransportContext.java:296) ~[?:?]
	at sun.security.ssl.TransportContext.fatal(TransportContext.java:291) ~[?:?]
	at sun.security.ssl.CertificateMessage$T13CertificateConsumer.checkServerCerts(CertificateMessage.java:1357) ~[?:?]
	at sun.security.ssl.CertificateMessage$T13CertificateConsumer.onConsumeCertificate(CertificateMessage.java:1232) ~[?:?]
	at sun.security.ssl.CertificateMessage$T13CertificateConsumer.consume(CertificateMessage.java:1175) ~[?:?]
	at sun.security.ssl.SSLHandshake.consume(SSLHandshake.java:392) ~[?:?]
	at sun.security.ssl.HandshakeContext.dispatch(HandshakeContext.java:443) ~[?:?]
	at sun.security.ssl.SSLEngineImpl$DelegatedTask$DelegatedAction.run(SSLEngineImpl.java:1074) ~[?:?]
	at sun.security.ssl.SSLEngineImpl$DelegatedTask$DelegatedAction.run(SSLEngineImpl.java:1061) ~[?:?]
	at java.security.AccessC

I am surprised about that because the client (producer) is configured to use an internal TLS listener and not Nodeport...

@see-quick
Copy link
Member Author

Update: I have managed to solve issue with the latest commit :)

@see-quick
Copy link
Member Author

/azp run regression

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@see-quick see-quick requested a review from Frawless October 27, 2022 13:11
@scholzj scholzj modified the milestones: 0.32.0, 0.33.0 Oct 28, 2022
@see-quick
Copy link
Member Author

/azp run regression

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@see-quick
Copy link
Member Author

/azp run regression

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

Signed-off-by: see-quick <[email protected]>
@see-quick
Copy link
Member Author

/azp run regression

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@see-quick
Copy link
Member Author

I noticed that the test case testCaRenewalBreakInMiddle is failing for some reason. (but it is not related to my PR). I can create issue for fixing that test case :).

@see-quick
Copy link
Member Author

/azp run regression

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ready for merge Label for PRs which are ready for merge System tests
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants