Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Create Database CA #9593

Merged
merged 55 commits into from
Apr 5, 2022
Merged

Create Database CA #9593

merged 55 commits into from
Apr 5, 2022

Conversation

jakule
Copy link
Contributor

@jakule jakule commented Dec 29, 2021

This PR introduces a new Certificate Authority type: Database CA.

  1. New CA is introduced for signing certificates used in Database access.
  2. New migration has been added to migrated existing HostCA (which is currently used for signing DB certs) as a Database CA. This will allow to upgrade existing clusters to v9.
  3. For new installations Database CA will be generated in the same way as other certificates on the first start up.

Closes #5029

@jakule jakule force-pushed the jakule/database-ca branch 2 times, most recently from 903fbd8 to 465c64a Compare January 6, 2022 22:48
@jakule jakule marked this pull request as ready for review January 7, 2022 04:11
@jakule jakule requested a review from r0mant January 7, 2022 04:11
@github-actions github-actions bot requested review from greedy52 and zmb3 January 7, 2022 04:11
@github-actions github-actions bot added database-access Database access related issues and PRs tctl tctl - Teleport admin tool labels Jan 7, 2022
@r0mant r0mant requested a review from smallinsky January 12, 2022 01:17
processEvent(ctx context.Context, e types.Event) error
// watchKind returns a watch
// required for this collection
watchKind() types.WatchKind
// erase erases all data in the collection
erase(ctx context.Context) error
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was this unused?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I think that not the only method that was not used. I didn't want to clean up all of them as this is not the goal of this PR.

// Copy the Host CA as Database CA.
log.Infof("Migrating Database CA")

cav2, ok := hostCA.(*types.CertAuthorityV2)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of copying the entire resource, how about we just make a new resource for the database CA and populate TLS key pairs appropriately? This way we don't end up with unnecessary fields in the new CA (e.g. SSH keys).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed.

lib/auth/init.go Outdated
Comment on lines 1074 to 1085
err = asrv.Trust.CreateCertAuthority(cav2)
if err != nil {
return trace.Wrap(err)
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We usually recommend scaling auth servers to one before upgrading but not sure how many people actually do that. So there is a possibility of a race if multiple auth servers are attempting the migration concurrently - one will create the CA here and then the other one will get an error. I would handle AlreadyExists error here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added

lib/auth/db.go Outdated
Comment on lines 46 to 49
databaseCA, err := s.GetCertAuthority(types.CertAuthID{
Type: types.DatabaseCA,
DomainName: clusterName.GetClusterName(),
}, true)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need a fallback to HostCA here. In general, the proper upgrade procedure is to upgrade auth servers first but in case there's a newer database agent connects to an older auth server, this will fail. It seems easy enough to do a fallback here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added

@@ -1020,6 +1028,57 @@ func migrateCertAuthorities(ctx context.Context, asrv *Server) error {
return nil
}

// migrateDBAuthority copies Host CA as Database CA. Before v9.0 database access was using host CA to sign all
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm thinking more about this and now pondering whether this is a good approach from the security standpoint. The scenario I've in my mind is this: say your host CA gets compromised, you rotate it but aren't aware that your db CA after this migration is the same and that you need to rotate it too. In a situation like this, you would probably rotate all your CAs anyway but I wonder if this may not be a desired outcome following the principle of least surprise.

@fspmarshall @russjones Do you guys have any thoughts about this migration approach versus just leaving database CA empty and falling back to host CA and letting users explicitly initialize their database CA by performing rotation? We considered both and chose this one for several reasons (e.g. we'd likely need to update rotation logic to support empty CAs for the other one, instruct users to perform rotation after upgrade if they want to migrate to separate database CA, etc.) but now wondering if this might not be the best way.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@r0mant I find the scenario you described pretty convincing. If a user wants a CA gone, we should protect them from accidentally leaving it in circulation like that.

If updating rotation logic to support empty CAs is tricky, what about updating it to check for duplicate CAs and rotate all CAs that match the targeted CA? E.g. if I try to rotate my host CA, and teleport notices that the DB ca happens to be equivalent, could it just automatically rotate it in lockstep? Injecting an additional CA type here would be fairly straightforward.

While doing that kind of thing silently is generally sketchy, I think we can safely assume that we won't have matching CAs by chance, meaning that it will basically always be a case like this one; we've split the duties that one CA used to perform into two or more separate CAs, and are now rotating for the first time since that split.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense. @jakule Could you please take a look which approach looks more straightforward to implement (supporting empty CA vs rotating all matching CAs in lockstep) and let's do it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@r0mant I'll look into that. Right now I'm leaning towards rotating all matching certificates. Fallback is probably simpler to implement, but if we decide to go this route, I don't think we will be able to remove that logic anytime soon, as we don't know when everyone decide to rotate their certs. Rotating all matching certs keeps the logic in one place and allows us to reuse it if we ever need it (Desktop Access for example).

Copy link
Contributor

@smallinsky smallinsky left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tctl auth export command handler also need to updated to handle DbCA : https://github.com/gravitational/teleport/blob/jakule/database-ca/tool/tctl/common/auth_command.go#L178
https://goteleport.com/docs/setup/reference/cli/#tctl-auth-export

Also the DBCa seems to be missing in updating trusted cluster flow:

for _, caType := range []types.CertAuthType{types.HostCA, types.UserCA} {

lib/auth/init.go Outdated
@@ -313,8 +313,14 @@ func Init(cfg InitConfig, opts ...ServerOption) (*Server, error) {
}
log.Infof("Created namespace: %q.", apidefaults.Namespace)

// Migrate Host CA as Database CA before certificates generation. Otherwise, the Database CA will be
// generated which we don't want for existing installations.
if err := migrateDBAuthority(asrv); err != nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that for this func should be called after migrateLegacyResources is invoked or migrateLegacyResources should be updated to take into account DB CA.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking about it, but if a CA is missing we will create it here:

for _, caType := range types.CertAuthTypes {

Then there is nothing to "migrate" in migrateLegaceResources as DB CA has been already created. That's why I created a new function and I call it before we create any new cert.

Copy link
Collaborator

@zmb3 zmb3 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM assuming all of Roman's comments are addressed.

@jakule jakule force-pushed the jakule/database-ca branch from 9ac75d0 to 841480a Compare January 26, 2022 02:33
@jakule
Copy link
Contributor Author

jakule commented Jan 26, 2022

tctl auth export command handler also need to updated to handle DbCA : https://github.com/gravitational/teleport/blob/jakule/database-ca/tool/tctl/common/auth_command.go#L178 https://goteleport.com/docs/setup/reference/cli/#tctl-auth-export

Also the DBCa seems to be missing in updating trusted cluster flow:

for _, caType := range []types.CertAuthType{types.HostCA, types.UserCA} {

@smallinsky Thanks. Good catch. I missed that.

lib/auth/db.go Outdated
@@ -89,7 +99,7 @@ func getServerNames(req *proto.DatabaseCertRequest) []string {

// SignDatabaseCSR generates a client certificate used by proxy when talking
// to a remote database service.
func (s *Server) SignDatabaseCSR(ctx context.Context, req *proto.DatabaseCSRRequest) (*proto.DatabaseCSRResponse, error) {
func (s *Server) SignDatabaseCSR(_ context.Context, req *proto.DatabaseCSRRequest) (*proto.DatabaseCSRResponse, error) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jakule I have just noticed that we're still using HostCA here. I think it should be updated as well.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@r0mant
The HostCA flag in scope of SignDatabaseCSR method is used to obtain HostCA certificate and add HostCA to the trusted pool were will used during reverse tunnel connection [teleport db proxy <--> db service ] mTLS handshake. DB service terminates the incoming connection with a server identity that was generated by HostCA during RegisterUsingToken (or LocalRegister if DB service is running with auth service) call.

The connection established with a cert generated by the SignDatabaseCSR method is teleport "internal component" connection between teleport proxy and DB agent thus do we really want to introduce different server identity generation for RoleDatabase SystemRole for DB service and sign RoleDatabase identity with DatabaseCA ?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@smallinsky @jakule So there are two aspects to it I think:

  • There's an agent (process) identity that gets generated when a new agent starts up. Currently it is signed by the Host CA. I think that ideally each agent's identity would also be signed by their respective CA (i.e. SSH nodes - by Host CA, database agents - by Database CA, desktop agents - by Desktop CA, etc.) but that would likely be a pretty large change and it's a little orthogonal to what this PR does. So I propose to leave agent identity as it is for now.

  • There's a certificate that's generated by this SignDatabaseCSR method that as you guys mentioned is used to transfer identity information over reverse tunnel to the database agent. I think it should use Database CA instead of Host CA as I suggested here. It makes sense that when communicating with a database agent the proxy would use the Database CA to sign those certs. This way one compromised CA does not affect others. It should be pretty easy to do if we make sure that database agent loads both Host CA and Database CA in its client cert pool.

One thing to keep in mind is that we should maintain backwards compatibility to avoid situation where a new Proxy signs a certificate with Database CA which older database agent can't verify (our upgrade instructions say to upgrade Proxies before agents). To make sure compatibility is not broken, Proxy can check the version of the agent it's connecting to (Teleport version should be available as a part of DatabaseServer resource) and sign the cert with Host CA if it's talking to an older agent (e.g. < 9.x) and with Database CA otherwise. Then in Teleport 10 we will remove Host CA from database agents' client pools.

WDYGT?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, Agree Connecting to an DB agent service with certs generated by DatabaseCA based on the agent version where DatabaseCA will be added to the DB agent poll sounds reasonable and should be pretty easy to implement.
Though without changing agent (process) identity signer to DatabaseCA we will still be using HostCA pool on the proxy side to verify cert presented by DB agent during reverse tunnel connection. But changing this might be more complicated ( for example identify is generated once and cached on local agent backend) and probably should be kept out of scope for this PR.

@jakule jakule force-pushed the jakule/database-ca branch 2 times, most recently from d3c0c4f to 9bace68 Compare February 1, 2022 18:34
@jakule jakule force-pushed the jakule/database-ca branch 2 times, most recently from 76a45f1 to 3bc32e8 Compare February 17, 2022 02:48
@jakule jakule requested review from r0mant and smallinsky February 17, 2022 22:07
@jakule
Copy link
Contributor Author

jakule commented Mar 18, 2022

I can see that PR updates the webassets and e submodule that was intended ?

@smallinsky No, those changes were not intended. I think I did something unusual when resolving the latest git merge conflicts. I reverted them. Thanks for pointing that out.

jakule and others added 2 commits March 18, 2022 12:16

// findDuplicatedCertificates checks if any CA provided as caTypes has a duplicate in allCerts. List of all duplicates
// is returned as a result. If no duplicates are found, then caTypes is returned in unmodified form.
func findDuplicatedCertificates(caTypes []types.CertAuthType, allCerts CertAuthorityMap) []types.CertAuthType {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we found a match here, I think we should print a warning for the user that their database CA will be rotated alongside host CA (and vice versa). Otherwise they might not expect it and it will break their database access.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added

ClusterName: identity.RouteToCluster,
CSR: csr,
ClusterName: identity.RouteToCluster,
SignWithDatabaseCA: teleportVer.LessThan(*semver.New("9.1.0")),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: Move version to a constant. And also add a TODO item to remove this in Teleport 11.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added

@@ -660,7 +668,7 @@ func getConfigForClient(conf *tls.Config, ap auth.ReadDatabaseAccessPoint, log l
log.Debugf("Ignoring unsupported cluster name %q.", info.ServerName)
}
}
pool, err := auth.ClientCertPool(ap, clusterName)
pool, err := auth.ClientCertPool(ap, clusterName, types.HostCA, types.DatabaseCA, types.UserCA)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to load all CAs here? Client certificates are signed by user CA, no?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When I refactored the function I wanted to keep the same sets of certs as before, but I agree. We need only UserCA. Changed

@r0mant
Copy link
Collaborator

r0mant commented Mar 30, 2022

@jakule @smallinsky Can we rebase and merge this?

jakule added 4 commits April 4, 2022 11:59
* Add DatabaseCA to trusted cluster rotation

* Add DatabaseCA test

* Refactor NewCertAuthorityWatcher - remove code duplication.

* Refactor TestDatabaseRotateTrustedCluster.

* Add comments to a DatabaseCA test.

* Improve database rotation test.

* Clean up test
Add comments

* Rename variable

* Fix logger in DB integration test.
@jakule jakule enabled auto-merge (squash) April 5, 2022 16:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
database-access Database access related issues and PRs tctl tctl - Teleport admin tool
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Decouple database CA from host CA
5 participants