-
Notifications
You must be signed in to change notification settings - Fork 1.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Create Database CA #9593
Create Database CA #9593
Conversation
903fbd8
to
465c64a
Compare
processEvent(ctx context.Context, e types.Event) error | ||
// watchKind returns a watch | ||
// required for this collection | ||
watchKind() types.WatchKind | ||
// erase erases all data in the collection | ||
erase(ctx context.Context) error |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Was this unused?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I think that not the only method that was not used. I didn't want to clean up all of them as this is not the goal of this PR.
// Copy the Host CA as Database CA. | ||
log.Infof("Migrating Database CA") | ||
|
||
cav2, ok := hostCA.(*types.CertAuthorityV2) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of copying the entire resource, how about we just make a new resource for the database CA and populate TLS key pairs appropriately? This way we don't end up with unnecessary fields in the new CA (e.g. SSH keys).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed.
lib/auth/init.go
Outdated
err = asrv.Trust.CreateCertAuthority(cav2) | ||
if err != nil { | ||
return trace.Wrap(err) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We usually recommend scaling auth servers to one before upgrading but not sure how many people actually do that. So there is a possibility of a race if multiple auth servers are attempting the migration concurrently - one will create the CA here and then the other one will get an error. I would handle AlreadyExists error here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added
lib/auth/db.go
Outdated
databaseCA, err := s.GetCertAuthority(types.CertAuthID{ | ||
Type: types.DatabaseCA, | ||
DomainName: clusterName.GetClusterName(), | ||
}, true) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need a fallback to HostCA here. In general, the proper upgrade procedure is to upgrade auth servers first but in case there's a newer database agent connects to an older auth server, this will fail. It seems easy enough to do a fallback here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added
@@ -1020,6 +1028,57 @@ func migrateCertAuthorities(ctx context.Context, asrv *Server) error { | |||
return nil | |||
} | |||
|
|||
// migrateDBAuthority copies Host CA as Database CA. Before v9.0 database access was using host CA to sign all |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm thinking more about this and now pondering whether this is a good approach from the security standpoint. The scenario I've in my mind is this: say your host CA gets compromised, you rotate it but aren't aware that your db CA after this migration is the same and that you need to rotate it too. In a situation like this, you would probably rotate all your CAs anyway but I wonder if this may not be a desired outcome following the principle of least surprise.
@fspmarshall @russjones Do you guys have any thoughts about this migration approach versus just leaving database CA empty and falling back to host CA and letting users explicitly initialize their database CA by performing rotation? We considered both and chose this one for several reasons (e.g. we'd likely need to update rotation logic to support empty CAs for the other one, instruct users to perform rotation after upgrade if they want to migrate to separate database CA, etc.) but now wondering if this might not be the best way.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@r0mant I find the scenario you described pretty convincing. If a user wants a CA gone, we should protect them from accidentally leaving it in circulation like that.
If updating rotation logic to support empty CAs is tricky, what about updating it to check for duplicate CAs and rotate all CAs that match the targeted CA? E.g. if I try to rotate my host CA, and teleport notices that the DB ca happens to be equivalent, could it just automatically rotate it in lockstep? Injecting an additional CA type here would be fairly straightforward.
While doing that kind of thing silently is generally sketchy, I think we can safely assume that we won't have matching CAs by chance, meaning that it will basically always be a case like this one; we've split the duties that one CA used to perform into two or more separate CAs, and are now rotating for the first time since that split.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense. @jakule Could you please take a look which approach looks more straightforward to implement (supporting empty CA vs rotating all matching CAs in lockstep) and let's do it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@r0mant I'll look into that. Right now I'm leaning towards rotating all matching certificates. Fallback is probably simpler to implement, but if we decide to go this route, I don't think we will be able to remove that logic anytime soon, as we don't know when everyone decide to rotate their certs. Rotating all matching certs keeps the logic in one place and allows us to reuse it if we ever need it (Desktop Access for example).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
tctl auth export
command handler also need to updated to handle DbCA : https://github.com/gravitational/teleport/blob/jakule/database-ca/tool/tctl/common/auth_command.go#L178
https://goteleport.com/docs/setup/reference/cli/#tctl-auth-export
Also the DBCa seems to be missing in updating trusted cluster flow:
teleport/lib/auth/trustedcluster.go
Line 519 in 9ac75d0
for _, caType := range []types.CertAuthType{types.HostCA, types.UserCA} { |
lib/auth/init.go
Outdated
@@ -313,8 +313,14 @@ func Init(cfg InitConfig, opts ...ServerOption) (*Server, error) { | |||
} | |||
log.Infof("Created namespace: %q.", apidefaults.Namespace) | |||
|
|||
// Migrate Host CA as Database CA before certificates generation. Otherwise, the Database CA will be | |||
// generated which we don't want for existing installations. | |||
if err := migrateDBAuthority(asrv); err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that for this func should be called after migrateLegacyResources is invoked or migrateLegacyResources
should be updated to take into account DB CA.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was thinking about it, but if a CA is missing we will create it here:
Line 323 in 9ac75d0
for _, caType := range types.CertAuthTypes { |
Then there is nothing to "migrate" in
migrateLegaceResources
as DB CA has been already created. That's why I created a new function and I call it before we create any new cert.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM assuming all of Roman's comments are addressed.
9ac75d0
to
841480a
Compare
@smallinsky Thanks. Good catch. I missed that. |
lib/auth/db.go
Outdated
@@ -89,7 +99,7 @@ func getServerNames(req *proto.DatabaseCertRequest) []string { | |||
|
|||
// SignDatabaseCSR generates a client certificate used by proxy when talking | |||
// to a remote database service. | |||
func (s *Server) SignDatabaseCSR(ctx context.Context, req *proto.DatabaseCSRRequest) (*proto.DatabaseCSRResponse, error) { | |||
func (s *Server) SignDatabaseCSR(_ context.Context, req *proto.DatabaseCSRRequest) (*proto.DatabaseCSRResponse, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jakule I have just noticed that we're still using HostCA here. I think it should be updated as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@r0mant
The HostCA flag in scope of SignDatabaseCSR method is used to obtain HostCA certificate and add HostCA to the trusted pool were will used during reverse tunnel connection [teleport db proxy <--> db service ] mTLS handshake. DB service terminates the incoming connection with a server identity that was generated by HostCA during RegisterUsingToken (or LocalRegister if DB service is running with auth service) call.
The connection established with a cert generated by the SignDatabaseCSR
method is teleport "internal component" connection between teleport proxy and DB agent thus do we really want to introduce different server identity generation for RoleDatabase
SystemRole for DB service and sign RoleDatabase identity with DatabaseCA ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@smallinsky @jakule So there are two aspects to it I think:
-
There's an agent (process) identity that gets generated when a new agent starts up. Currently it is signed by the Host CA. I think that ideally each agent's identity would also be signed by their respective CA (i.e. SSH nodes - by Host CA, database agents - by Database CA, desktop agents - by Desktop CA, etc.) but that would likely be a pretty large change and it's a little orthogonal to what this PR does. So I propose to leave agent identity as it is for now.
-
There's a certificate that's generated by this SignDatabaseCSR method that as you guys mentioned is used to transfer identity information over reverse tunnel to the database agent. I think it should use Database CA instead of Host CA as I suggested here. It makes sense that when communicating with a database agent the proxy would use the Database CA to sign those certs. This way one compromised CA does not affect others. It should be pretty easy to do if we make sure that database agent loads both Host CA and Database CA in its client cert pool.
One thing to keep in mind is that we should maintain backwards compatibility to avoid situation where a new Proxy signs a certificate with Database CA which older database agent can't verify (our upgrade instructions say to upgrade Proxies before agents). To make sure compatibility is not broken, Proxy can check the version of the agent it's connecting to (Teleport version should be available as a part of DatabaseServer resource) and sign the cert with Host CA if it's talking to an older agent (e.g. < 9.x) and with Database CA otherwise. Then in Teleport 10 we will remove Host CA from database agents' client pools.
WDYGT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, Agree Connecting to an DB agent service with certs generated by DatabaseCA based on the agent version where DatabaseCA will be added to the DB agent poll sounds reasonable and should be pretty easy to implement.
Though without changing agent (process) identity signer to DatabaseCA
we will still be using HostCA
pool on the proxy side to verify cert presented by DB agent during reverse tunnel connection. But changing this might be more complicated ( for example identify is generated once and cached on local agent backend) and probably should be kept out of scope for this PR.
d3c0c4f
to
9bace68
Compare
76a45f1
to
3bc32e8
Compare
Remove SSH key form Database CA
@smallinsky No, those changes were not intended. I think I did something unusual when resolving the latest git merge conflicts. I reverted them. Thanks for pointing that out. |
Co-authored-by: Marek Smoliński <[email protected]>
|
||
// findDuplicatedCertificates checks if any CA provided as caTypes has a duplicate in allCerts. List of all duplicates | ||
// is returned as a result. If no duplicates are found, then caTypes is returned in unmodified form. | ||
func findDuplicatedCertificates(caTypes []types.CertAuthType, allCerts CertAuthorityMap) []types.CertAuthType { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we found a match here, I think we should print a warning for the user that their database CA will be rotated alongside host CA (and vice versa). Otherwise they might not expect it and it will break their database access.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added
lib/srv/db/proxyserver.go
Outdated
ClusterName: identity.RouteToCluster, | ||
CSR: csr, | ||
ClusterName: identity.RouteToCluster, | ||
SignWithDatabaseCA: teleportVer.LessThan(*semver.New("9.1.0")), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: Move version to a constant. And also add a TODO item to remove this in Teleport 11.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added
lib/srv/db/proxyserver.go
Outdated
@@ -660,7 +668,7 @@ func getConfigForClient(conf *tls.Config, ap auth.ReadDatabaseAccessPoint, log l | |||
log.Debugf("Ignoring unsupported cluster name %q.", info.ServerName) | |||
} | |||
} | |||
pool, err := auth.ClientCertPool(ap, clusterName) | |||
pool, err := auth.ClientCertPool(ap, clusterName, types.HostCA, types.DatabaseCA, types.UserCA) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need to load all CAs here? Client certificates are signed by user CA, no?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When I refactored the function I wanted to keep the same sets of certs as before, but I agree. We need only UserCA
. Changed
Fix "remove in" version
@jakule @smallinsky Can we rebase and merge this? |
* Add DatabaseCA to trusted cluster rotation * Add DatabaseCA test * Refactor NewCertAuthorityWatcher - remove code duplication. * Refactor TestDatabaseRotateTrustedCluster. * Add comments to a DatabaseCA test. * Improve database rotation test. * Clean up test Add comments * Rename variable * Fix logger in DB integration test.
This PR introduces a new Certificate Authority type: Database CA.
Closes #5029