Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Heartbeat] Fix expiration of cert chains calculation #33231

Merged
merged 8 commits into from
Oct 11, 2022

Conversation

andrewvc
Copy link
Contributor

@andrewvc andrewvc commented Sep 30, 2022

Fixes #33215

This is a bit WIP in that I'm not certain this is the final logic we want. It appears the elastic agent libs override the default golang TLS implementation's verification and only set the cert chains attribute on the connection when strict validation is enabled, where the default is full. See https://www.elastic.co/guide/en/beats/heartbeat/current/configuration-ssl.html for definition of the various validation options.

I suspect we'd be best off making strict the default since this is what chrome does nowadays, requiring a SAN to be set, which is the only distinction between the checks of strict and full.

To test this use the following config, which returns a too-early expiration date without this patch:

- type: http
  enabled: true
  id: wyportal
  name: WYPortal
  schedule: "@every 1m"
  ssl.verification_mode: strict
  urls: "https://portals.edu.wyoming.gov/"

You can try playing with the verification mode as well

What does this PR do?

Why is it important?

Checklist

  • My code follows the style guidelines of this project
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I have made corresponding change to the default configuration files
  • I have added tests that prove my fix is effective or that my feature works
  • I have added an entry in CHANGELOG.next.asciidoc or CHANGELOG-developer.next.asciidoc.

Author's Checklist

  • [ ]

How to test this PR locally

Use the above config and run and build with the following command to prune the output. Note this requires your heartbeat yaml to not have an output section set.

mage build && ./heartbeat -e  -E output.console={} | jq '{"id": .monitor.id, "na": .tls.server.x509.not_after}'

Fixes elastic#33215

This is a bit WIP in that I'm not certain this is the final logic we want. It appears the elastic agent libs override the default golang TLS implementation's verification and only set the cert chains attribute on the connection when strict validation is enabled, where the default is `full`. See https://www.elastic.co/guide/en/beats/heartbeat/current/configuration-ssl.html for definition of the various validation options.

I suspect we'd be best off making strict the default since this is what chrome does nowadays, requiring a SAN to be set, which is the only other distinction.

To test this use the following config, which returns a too-early expiration date without this patch:

```
- type: http
  enabled: true
  id: wyportal
  name: WYPortal
  schedule: "@every 1m"
  ssl.verification_mode: strict
  urls: "https://portals.edu.wyoming.gov/"
```

You can try playing with the verification mode as well
@andrewvc andrewvc added bug Team:obs-ds-hosted-services Label for the Observability Hosted Services team labels Sep 30, 2022
@andrewvc andrewvc self-assigned this Sep 30, 2022
@andrewvc andrewvc requested a review from a team as a code owner September 30, 2022 03:05
@botelastic botelastic bot added the needs_team Indicates that the issue/PR needs a Team:* label label Sep 30, 2022
@elasticmachine
Copy link
Collaborator

Pinging @elastic/uptime (Team:Uptime)

@mergify
Copy link
Contributor

mergify bot commented Sep 30, 2022

This pull request does not have a backport label.
If this is a bug or security fix, could you label this PR @andrewvc? 🙏.
For such, you'll need to label your PR with:

  • The upcoming major version of the Elastic Stack
  • The upcoming minor version of the Elastic Stack (if you're not pushing a breaking change)

To fixup this pull request, you need to add the backport labels for the needed
branches, such as:

  • backport-v8./d.0 is the label to automatically backport to the 8./d branch. /d is the digit

@botelastic botelastic bot removed the needs_team Indicates that the issue/PR needs a Team:* label label Sep 30, 2022
@andrewvc andrewvc marked this pull request as draft September 30, 2022 03:06
@elasticmachine
Copy link
Collaborator

elasticmachine commented Sep 30, 2022

💚 Build Succeeded

the below badges are clickable and redirect to their specific view in the CI or DOCS
Pipeline View Test View Changes Artifacts preview preview

Expand to view the summary

Build stats

  • Start Time: 2022-10-11T14:15:49.652+0000

  • Duration: 46 min 44 sec

Test stats 🧪

Test Results
Failed 0
Passed 1869
Skipped 25
Total 1894

💚 Flaky test report

Tests succeeded.

🤖 GitHub comments

Expand to view the GitHub comments

To re-run your PR in the CI, just comment with:

  • /test : Re-trigger the build.

  • /package : Generate the packages and run the E2E tests.

  • /beats-tester : Run the installation tests with beats-tester.

  • run elasticsearch-ci/docs : Re-trigger the docs validation. (use unformatted text in the comment!)

@andrewvc andrewvc marked this pull request as ready for review October 4, 2022 01:17
@andrewvc andrewvc added v8.5.0 and removed v8.5.0 labels Oct 4, 2022
Copy link
Collaborator

@emilioalvap emilioalvap left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the strict by default discussion, it would seem the actual config that drives chain behaviour is InsecureSkipVerify, not verification_mode itself. Could we add this as a default for all https monitors? It would still make possible to get the correct result with full verification mode.


// If this chain expires sooner than a previously seen chain we don't
// set any fields
if chainNotAfter.Before(maxNotAfterFloor) {
Copy link
Collaborator

@emilioalvap emilioalvap Oct 5, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maxNotAfterFloor is not updated in the loop, so all chains go through overriding each other. Is that expected? Shouldn't this be initialiazed to hostCert.NotAfter if available? Edit: nvm, chain expiration can pre-date peer certificate expiration

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch, will fix

@mergify
Copy link
Contributor

mergify bot commented Oct 10, 2022

This pull request is now in conflicts. Could you fix it? 🙏
To fixup this pull request, you can check out it locally. See documentation: https://help.github.com/articles/checking-out-pull-requests-locally/

git fetch upstream
git checkout -b fix-cert-chains upstream/fix-cert-chains
git merge upstream/main
git push upstream fix-cert-chains

@andrewvc
Copy link
Contributor Author

@emilioalvap this is now ready for review again, I spent probably a bit too much effort improving the logic and test cases in aa35bb0 , while I think it could be cleaned up further, given that this requires strict, I think it covers more than can probably be justified given the effort of digging up the right certs.

@paulb-elastic
Copy link
Contributor

Taking the PR off the board as the parent issue is on there

Copy link
Collaborator

@emilioalvap emilioalvap left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, reviewed test scenarios locally and it holds up. Still some linting to fix, though

@andrewvc andrewvc merged commit bd081b6 into elastic:main Oct 11, 2022
@andrewvc andrewvc deleted the fix-cert-chains branch October 11, 2022 20:56
@andrewvc
Copy link
Contributor Author

@Mergifyio backport 8.5

@mergify
Copy link
Contributor

mergify bot commented Oct 11, 2022

backport 8.5

✅ Backports have been created

mergify bot pushed a commit that referenced this pull request Oct 11, 2022
* [Heartbeat] Fix expiration of cert chains calculation

Fixes #33215

This is a bit WIP in that I'm not certain this is the final logic we want. It appears the elastic agent libs override the default golang TLS implementation's verification and only set the cert chains attribute on the connection when strict validation is enabled, where the default is `full`. See https://www.elastic.co/guide/en/beats/heartbeat/current/configuration-ssl.html for definition of the various validation options.

I suspect we'd be best off making strict the default since this is what chrome does nowadays, requiring a SAN to be set, which is the only other distinction.

To test this use the following config, which returns a too-early expiration date without this patch:

```
- type: http
  enabled: true
  id: wyportal
  name: WYPortal
  schedule: "@every 1m"
  ssl.verification_mode: strict
  urls: "https://portals.edu.wyoming.gov/"
```

You can try playing with the verification mode as well

* Cleanup work

* Add changelog

* Fix cert calculations

* Fix lint

(cherry picked from commit bd081b6)
@andrewvc
Copy link
Contributor Author

@Mergifyio backport 7.17

mergify bot pushed a commit that referenced this pull request Oct 11, 2022
* [Heartbeat] Fix expiration of cert chains calculation

Fixes #33215

This is a bit WIP in that I'm not certain this is the final logic we want. It appears the elastic agent libs override the default golang TLS implementation's verification and only set the cert chains attribute on the connection when strict validation is enabled, where the default is `full`. See https://www.elastic.co/guide/en/beats/heartbeat/current/configuration-ssl.html for definition of the various validation options.

I suspect we'd be best off making strict the default since this is what chrome does nowadays, requiring a SAN to be set, which is the only other distinction.

To test this use the following config, which returns a too-early expiration date without this patch:

```
- type: http
  enabled: true
  id: wyportal
  name: WYPortal
  schedule: "@every 1m"
  ssl.verification_mode: strict
  urls: "https://portals.edu.wyoming.gov/"
```

You can try playing with the verification mode as well

* Cleanup work

* Add changelog

* Fix cert calculations

* Fix lint

(cherry picked from commit bd081b6)

# Conflicts:
#	heartbeat/hbtest/hbtestutil.go
#	heartbeat/monitors/active/dialchain/tlsmeta/tlsmeta.go
#	heartbeat/monitors/active/dialchain/tlsmeta/tlsmeta_test.go
#	heartbeat/monitors/active/http/task.go
#	heartbeat/monitors/active/tcp/tcp.go
@mergify
Copy link
Contributor

mergify bot commented Oct 11, 2022

backport 7.17

✅ Backports have been created

andrewvc added a commit that referenced this pull request Oct 12, 2022
* [Heartbeat] Fix expiration of cert chains calculation

Fixes #33215

This is a bit WIP in that I'm not certain this is the final logic we want. It appears the elastic agent libs override the default golang TLS implementation's verification and only set the cert chains attribute on the connection when strict validation is enabled, where the default is `full`. See https://www.elastic.co/guide/en/beats/heartbeat/current/configuration-ssl.html for definition of the various validation options.

I suspect we'd be best off making strict the default since this is what chrome does nowadays, requiring a SAN to be set, which is the only other distinction.

To test this use the following config, which returns a too-early expiration date without this patch:

```
- type: http
  enabled: true
  id: wyportal
  name: WYPortal
  schedule: "@every 1m"
  ssl.verification_mode: strict
  urls: "https://portals.edu.wyoming.gov/"
```

You can try playing with the verification mode as well

* Cleanup work

* Add changelog

* Fix cert calculations

* Fix lint

(cherry picked from commit bd081b6)

Co-authored-by: Andrew Cholakian <[email protected]>
chrisberkhout pushed a commit that referenced this pull request Jun 1, 2023
* [Heartbeat] Fix expiration of cert chains calculation

Fixes #33215

This is a bit WIP in that I'm not certain this is the final logic we want. It appears the elastic agent libs override the default golang TLS implementation's verification and only set the cert chains attribute on the connection when strict validation is enabled, where the default is `full`. See https://www.elastic.co/guide/en/beats/heartbeat/current/configuration-ssl.html for definition of the various validation options.

I suspect we'd be best off making strict the default since this is what chrome does nowadays, requiring a SAN to be set, which is the only other distinction.

To test this use the following config, which returns a too-early expiration date without this patch:

```
- type: http
  enabled: true
  id: wyportal
  name: WYPortal
  schedule: "@every 1m"
  ssl.verification_mode: strict
  urls: "https://portals.edu.wyoming.gov/"
```

You can try playing with the verification mode as well

* Cleanup work

* Add changelog

* Fix cert calculations

* Fix lint
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Team:obs-ds-hosted-services Label for the Observability Hosted Services team v8.5.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Heartbeat] Properly handle cross-signed CAs
4 participants