Teleport 15 Test Plan #36663
Desktop Access @probakowski @ibeckermayer
Binaries / OS compatibility @fheinecke
Verify that our software runs on the minimum supported OS versions as per
Windows @ravicious
macOS @camscale
Linux @camscale
Machine ID @strideynet
With an SSH node registered to the Teleport cluster:
With a Postgres DB registered to the Teleport cluster:
With a Kubernetes cluster registered to the Teleport cluster:
With a HTTP application registered to the Teleport cluster:
Host users creation @lxea
Host users creation docs
CA rotations @fspmarshall
Proxy Peering
EC2 Discovery @marcoandredinis
Azure Discovery @marcoandredinis
GCP Discovery @atburke
IP Pinning @AntonAM
Add a role with
Assist @jakule @ryanclark @tigrato @xacrimon @justinas
Assist is not supported by
Resources
Certificate presented by root proxy for leaf agentless nodes is not trusted by client #36801
No audit log entries when SCP denied: #36820
Desktop session clipboard and directory sharing icon state is unclear: #36825
Login failure events aren't emitted if MFA is enabled: #36837
#31410, which I reported for v14, is still not fixed
tsh does not work on Windows 10 rev. 1607. This is a regression introduced somewhere between 13.0.0 and 13.4.15. [Alan]: Fixed by #36859.
Agent locking is broken: https://github.com/gravitational/teleport-private/issues/1340
A minor issue with tsh in FIPS mode (not sure if related to FIPS itself, though): #36922
Unable to use RDP in FIPS mode: #36928
Nit:
Helm chart
Jump host fails with unknown certificate authority: #36964
default
Performance Test Results
Cloud
Load Tests
Soak Tests
Origin: us-east-1 Target: us-east-1
/usr/local/bin/tsh bench ssh --duration=30m root@node-agents-67588c8d58-26f2m-00 ls
* Requests originated: 17998
* Requests failed: 0
Histogram
Percentile Response Duration
---------- -----------------
25 219 ms
50 224 ms
75 229 ms
90 234 ms
95 236 ms
99 245 ms
100 1228 ms
/usr/local/bin/tsh bench ssh --duration=30m root@fullname=node-agents-67588c8d58-26f2m-00 ls
* Requests originated: 17996
* Requests failed: 0
Histogram
Percentile Response Duration
---------- -----------------
25 481 ms
50 488 ms
75 495 ms
90 501 ms
95 505 ms
99 514 ms
100 1522 ms
/usr/local/bin/tsh bench ssh --duration=30m --random root@all ls
* Requests originated: 17998
* Requests failed: 0
Histogram
Percentile Response Duration
---------- -----------------
25 218 ms
50 224 ms
75 232 ms
90 242 ms
95 251 ms
99 289 ms
100 784 ms
Origin: us-east-1 Target: us-west-2
/usr/local/bin/tsh bench ssh --duration=30m root@ip-172-31-35-106 ls
* Requests originated: 17993
* Requests failed: 0
Histogram
Percentile Response Duration
---------- -----------------
25 783 ms
50 801 ms
75 827 ms
90 850 ms
95 858 ms
99 878 ms
100 1102 ms
Origin: us-west-2 Target: us-east-1
/usr/local/bin/tsh bench ssh --duration=30m root@node-agents-67588c8d58-26f2m-00 ls
* Requests originated: 17992
* Requests failed: 0
Histogram
Percentile Response Duration
---------- -----------------
25 823 ms
50 836 ms
75 845 ms
90 851 ms
95 858 ms
99 870 ms
100 1980 ms
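The soak-test histograms above all share one plain-text layout, so runs from different releases can be compared mechanically rather than by eye. The following is a minimal Python sketch, not part of Teleport: parse_bench_histogram and tail_ratio are hypothetical helper names, and the parser assumes the percentile rows look exactly like the pasted output ("<percentile> <duration> ms").

```python
import re

# Sample copied from the first us-east-1 -> us-east-1 run above.
SAMPLE = """\
Percentile Response Duration
---------- -----------------
25 219 ms
50 224 ms
75 229 ms
90 234 ms
95 236 ms
99 245 ms
100 1228 ms
"""


def parse_bench_histogram(text):
    """Return {percentile: duration_ms} for every '<p> <n> ms' row in text.

    Assumption: rows follow the layout pasted in this issue; header and
    separator lines are skipped because they do not match the pattern.
    """
    rows = {}
    for line in text.splitlines():
        match = re.fullmatch(r"\s*(\d+)\s+(\d+)\s*ms\s*", line)
        if match:
            rows[int(match.group(1))] = int(match.group(2))
    return rows


def tail_ratio(rows):
    """Ratio of the slowest request (p100) to p99, a rough outlier signal."""
    return rows[100] / rows[99]


if __name__ == "__main__":
    rows = parse_bench_histogram(SAMPLE)
    print(rows)
    print(f"p100/p99 = {tail_ratio(rows):.2f}")
```

A high p100/p99 ratio, as in the sample run (1228 ms against 245 ms), would suggest a small number of slow outliers rather than a general slowdown; whether that matters for a release is a judgment call this sketch does not make.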
etcd
Postgres
Note: The Postgres backend exhibited some odd memory usage behaviors that were not observed when testing other backends.
Firestore
IBM changed its admin etcd login process, docs are not working anymore: #37059
Cancelling running query doesn't work for CockroachDB #37074
Found a seemingly Cloud-specific issue with inviting new users to a cluster. #37159
Okta integration installer doesn't create SSO connector: #37160
Goroutine leak on PostgreSQL database access: #37219
Database Access load test (PostgreSQL and MySQL)
Setup (same as previous test)
EKS with a single node group:
Teleport cluster (all deployed on the EKS cluster):
Databases:
Note: Databases were configured using discovery running inside the database agent.
MySQL
10 connections/second
50 connections/second
PostgreSQL
10 connections/second
50 connections/second
SSH Connection Resumption
Verify that SSH works, and that resumable SSH is not interrupted across a Teleport Cloud tenant upgrade.
(resumed connections to peered nodes work with a local tsh after #37352)
Verify that SSH works, and that resumable SSH is not interrupted across a control plane restart (of either the root or the leaf cluster).
The "SSH gRPC" transport client code doesn't unblock the connection on
Manual Testing Plan
Below are the items that should be manually tested with each release of Teleport.
These tests should be run on both a fresh installation of the version to be released
as well as an upgrade of the previous version of Teleport.
Adding nodes to a cluster @lxea
Labels @lxea
Trusted Clusters @bl-nero
RBAC @bl-nero
Make sure that invalid and valid attempts are reflected in audit log. Do this with both Teleport and Agentless nodes.
Verify that custom PAM environment variables are available as expected. @atburke
Users @codingllama
With every user combination, try to log in and sign up with an invalid second
factor and an invalid password to see how the system reacts.
WebAuthn in the release tsh binary is implemented using libfido2 for Linux/macOS. Ask for a statically built pre-release binary for realistic tests. (tsh fido2 diag should work in our binary.) WebAuthn in Windows builds is implemented using webauthn.dll. (tsh webauthnwin diag with a security key selected in the dialog should work.)
Touch ID requires a signed tsh; ask for a signed pre-release binary so you may run the tests.
Windows WebAuthn requires Windows 10 19H1 and a device capable of Windows Hello.
Adding Users Password Only
Adding Users OTP
Adding Users WebAuthn
Adding Users via platform authenticator
Managing MFA devices
tsh mfa add
tsh mfa add
tsh mfa add
tsh mfa ls
tsh mfa rm
tsh mfa rm
second_factor: on in auth_service, should fail
second_factor: optional in auth_service, should succeed
Login Password Only
Login with MFA
tsh mfa add
Login OIDC
Login SAML
Login GitHub
Deleting Users
Backends @rosstimothy
Session Recording @atburke
Enhanced Session Recording @jakule
disk, command and network events are being logged.
enhanced_recording role option.
Auditd @jakule
teleport/lib/auditd/common.go
Lines 25 to 34 in 7744f72
Audit Log @rosstimothy
Audit log with dynamodb
Audit log with Firestore
Failed login attempts are recorded
Interactive sessions have the correct Server ID
server_id is the ID of the node in "session_recording: node" mode
server_id is the ID of the node in "session_recording: proxy" mode
forwarded_by is the ID of the proxy in "session_recording: proxy" mode
Node/Proxy ID may be found at /var/lib/teleport/host_uuid in the corresponding machine.
Node IDs may also be queried via tctl nodes ls.
Exec commands are recorded
scp commands are recorded
Subsystem results are recorded
Subsystem testing may be achieved using both Recording Proxy mode and OpenSSH integration.
Assuming the proxy is proxy.example.com:3023 and node1 is a node running OpenSSH/sshd, you may use the following command to trigger a subsystem audit log:
sftp -o "ProxyCommand ssh -o 'ForwardAgent yes' -p 3023 %r@proxy.example.com -s proxy:%h:%p" root@node1
External Audit Storage @nklaassen
External Audit Storage must be tested on an Enterprise Cloud tenant.
Instructions for deploying a custom release to a cloud staging tenant: https://github.com/gravitational/teleport.e/blob/master/dev-deploy.md
tsh play <session-id> works
Interact with a cluster using tsh @capnspacehook
These commands should ideally be tested for recording and non-recording modes, as they are implemented in different ways.
Interact with a cluster using ssh @strideynet
Make sure to test both recording and regular proxy modes.
Verify proxy jump functionality @atburke Jump host fails with unknown certificate authority #36964
Log into leaf cluster via root, shut down the root proxy and verify proxy jump works.
Interact with a cluster using the Web UI @capnspacehook
X11 Forwarding @marcoandredinis
xeyes and xclip: apt install x11-apps xclip
xeyes. Then brew install xclip.
ssh_service.x11.enabled = yes
tsh ssh -X user@node xeyes
tsh ssh -X root@node xeyes
tsh ssh -Y server01 "echo Hello World | xclip -sel c && xclip -sel c -o" should print "Hello World"
tsh ssh -X server01 "echo Hello World | xclip -sel c && xclip -sel c -o" should fail with "BadAccess" X error
User accounting @atburke
/var/run/utmp on Linux.
/var/log/wtmp on Linux.
Combinations @capnspacehook
For some manual testing, many combinations need to be tested. For example, for
interactive sessions the 12 combinations are below.
Add an agentless Node in a local cluster.
Add a Teleport Node in a local cluster.
Add an agentless Node in a remote (leaf) cluster.
Add a Teleport Node in a remote (leaf) cluster.
Teleport with EKS/GKE @AntonAM
Teleport with multiple Kubernetes clusters @tigrato
Note: you can use GKE or EKS or minikube to run Kubernetes clusters.
Minikube is the only caveat - it's not reachable publicly so don't run a proxy there.
tsh login, check that tsh kube ls has your cluster
kubectl get nodes, kubectl exec -it $SOME_POD -- sh
tsh login, check that tsh kube ls has your cluster
kubectl get nodes, kubectl exec -it $SOME_POD -- sh
tsh login, check that tsh kube ls has your cluster
kubectl get nodes, kubectl exec -it $SOME_POD -- sh
tsh login, check that tsh kube ls has both clusters
tsh kube login
kubectl get nodes, kubectl exec -it $SOME_POD -- sh on the new cluster
tsh login, check that tsh kube ls has all clusters
name and labels
Step 2 login value matching the rows name column
name or labels in the search bar works
name column
Kubernetes auto-discovery @AntonAM
tctl create.
tctl create -f.
tctl rm.
Kubernetes Secret Storage @AntonAM
Statefulset
Kubernetes Pod RBAC @AntonAM
kubernetes_resources:
{"kind":"pod","name":"*","namespace":"*"} - must allow access to every pod.
{"kind":"pod","name":"<somename>","namespace":"*"} - must allow access to pod <somename> in every namespace.
{"kind":"pod","name":"*","namespace":"<somenamespace>"} - must allow access to any pod in <somenamespace> namespace.
* wildcards - <some-name>-* and regex for name and namespace fields.
go-client.
kubernetes_resources:
kubernetes_groups that denies exec into a pod
search_as_roles is not allowed.
Teleport with FIPS mode @bl-nero
ACME @bl-nero
Migrations @tigrato
SSH should work for both main and old clusters
SSH should work
Command Templates
When interacting with a cluster, the following command templates are useful:
OpenSSH
Teleport
Teleport with SSO Providers
GitHub External SSO @capnspacehook
tctl sso family of commands @flyinghermit
For help with setting up SSO connectors, check out the [Quick GitHub/SAML/OIDC Setup Tips]
tctl sso configure helps to construct a valid connector definition:
tctl sso configure github ... creates valid connector definitions
tctl sso configure oidc ... creates valid connector definitions
tctl sso configure saml ... creates valid connector definitions
tctl sso test tests a provided connector definition, which can be loaded from a file or piped in with tctl sso configure or tctl get --with-secrets. Valid connectors are accepted, invalid ones are rejected with sensible error messages.
tctl sso test.
SSO login on remote host @atburke
tsh should be running on a remote host (e.g. over an SSH session) and use the local browser to complete an SSO login. Run tsh login --callback <remote.host>:<port> --bind-addr localhost:<port> --auth <auth> on the remote host. Note that the --callback URL must be able to resolve to the --bind-addr over HTTPS.
Teleport Plugins @EdwardDowling
Teleport Operator @hugoShaka
teleport-cluster Helm chart and the operator enabled
AWS Node Joining @hugoShaka
Docs
ec2:DescribeInstances permissions for local account:
TELEPORT_TEST_EC2=1 go test ./integration -run TestEC2NodeJoin
TELEPORT_TEST_EC2=1 go test ./integration -run TestIAMNodeJoin
Kubernetes Node Joining @hugoShaka
Azure Node Joining @marcoandredinis
Docs
GCP Node Joining @hugoShaka
Docs
Cloud Labels @hugoShaka @marcoandredinis
and with tag foo:bar. Verify that a node running on the instance has label aws/foo=bar.
foo:bar. Verify that a node running on the instance has label azure/foo=bar.
Passwordless @codingllama
This feature has additional build requirements, so it should be tested with a
pre-release build from Drone (eg: https://get.gravitational.com/tsh-v10.0.0-alpha.2.pkg).
This section complements "Users -> Managing MFA devices". tsh binaries for
each operating system (Linux, macOS and Windows) must be tested separately for
FIDO2 items.
Diagnostics
Commands should pass all tests.
tsh fido2 diag (macOS/Linux)
tsh touchid diag (macOS only)
tsh webauthnwin diag (Windows only)
Registration
tsh mfa add, choose WEBAUTHN and passwordless)
tsh mfa add, choose TOUCHID)
tsh mfa add, choose WEBAUTHN and passwordless)
Login
tsh login --auth=passwordless)
tsh login --auth=passwordless)
tsh login --auth=passwordless --mfa-mode=cross-platform uses FIDO2
tsh login --auth=passwordless --mfa-mode=platform uses platform authenticator
tsh login --auth=passwordless --mfa-mode=auto prefers platform authenticator
the same device)
(auth_service.authentication.passwordless = false)
(auth_service.authentication.connector_name = passwordless)
(tsh login --auth=local)
Touch ID support commands
tsh touchid ls works
tsh touchid rm works (careful, may lock you out!)
Device Trust @codingllama
Device Trust requires Teleport Enterprise.
This feature has additional build requirements, so it should be tested with a
pre-release build from Drone (eg: https://get.gravitational.com/teleport-ent-v10.0.0-alpha.2-linux-amd64-bin.tar.gz).
Client-side enrollment requires a signed tsh for macOS, make sure to use the tsh binary from tsh.app.
A simple formula for testing device authorization is:
Inventory management
tctl devices add)
tctl devices add --enroll)
tctl devices ls)
tctl devices rm)
tctl devices rm)
tctl devices enroll)
tctl devices enroll)
Device enrollment
Enroll/authn device on macOS (tsh device enroll)
Enroll/authn device on Windows (tsh device enroll)
Enroll/authn device on Linux (tsh device enroll)
Linux users need read/write permissions to /dev/tpmrm0. The simplest way is to assign yourself to the tss group. See https://goteleport.com/docs/access-controls/device-trust/device-management/#troubleshooting.
Verify device extensions on TLS certificate
Note that different accesses have different certificates (Database, Kube,
etc).
Verify device extensions on SSH certificate
Device authorization
device_trust.mode other than "off" or "" not allowed (OSS)
device_trust.mode="off" doesn't impede access (Enterprise and OSS)
device_trust.mode="optional" doesn't impede access, but issues device
extensions on login
device_trust.mode="required" enforces enrolled devices
device_trust.mode="required" is enforced by processes, and not only by
Auth APIs
Testing this requires issuing a certificate without device extensions
(mode="off"), then changing the cluster configuration to mode="required" and
attempting to access a process directly, without a login attempt.
Role-based authz enforces enrolled devices
(device_trust.mode="off" or "optional",
role.spec.options.device_trust_mode="required")
Device authorization works correctly for both require_session_mfa=false
and require_session_mfa=true
Device authorization applies to SSH access (all items above)
Device authorization applies to Trusted Clusters (root with
mode="optional" and leaf with mode="required")
Device authorization applies to Database access (all items above)
Device authorization applies to Kubernetes access (all items above)
Cluster-wide device authorization does not apply to App access
Role-based device authorization applies to App access
Device authorization does not apply to Windows Desktop access
(both cluster-wide and role) @ibeckermayer
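The authorization items above amount to a small decision table, which can help when deciding which combinations still need a manual pass. Below is a minimal Python sketch of the expected outcomes, written only from the checklist; it is not Teleport's implementation. The function names device_required and access_expected are hypothetical, and the access_kind strings ("ssh", "db", "kube", "app", "desktop") are labels assumed for illustration. The Trusted Clusters and require_session_mfa items are not modeled.

```python
def device_required(cluster_mode, role_mode, access_kind):
    """Whether the checklist expects an enrolled device to be required.

    cluster_mode: device_trust.mode ("off", "", "optional" or "required").
    role_mode:    role.spec.options.device_trust_mode ("" when unset).
    access_kind:  illustrative label, one of ssh/db/kube/app/desktop.
    """
    if access_kind == "desktop":
        # Checklist: does not apply to Windows Desktop access,
        # both cluster-wide and role.
        return False
    if role_mode == "required":
        # Checklist: role-based authz enforces enrolled devices,
        # and role-based authorization also applies to App access.
        return True
    if access_kind == "app":
        # Checklist: cluster-wide authorization does not apply to App access.
        return False
    # "off" and "optional" do not impede access; "required" enforces.
    return cluster_mode == "required"


def access_expected(cluster_mode, role_mode, access_kind, device_enrolled):
    """True when the checklist expects access to be allowed."""
    return device_enrolled or not device_required(
        cluster_mode, role_mode, access_kind
    )


if __name__ == "__main__":
    for kind in ("ssh", "db", "kube", "app", "desktop"):
        allowed = access_expected("required", "", kind, device_enrolled=False)
        print(f"mode=required, unenrolled device, {kind}: allowed={allowed}")
```

Encoding the expectations this way makes the App and Windows Desktop exceptions explicit, which are the rows most easily missed when testing by hand.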
Device audit (see lib/events/codes.go)
data (for certificates with device extensions)
Binary support
tsh for macOS gives a sane error message for tsh device enroll attempts.
Device support commands
tsh device collect (macOS)
tsh device asset-tag (macOS)
tsh device collect (Windows)
tsh device asset-tag (Windows)
tsh device collect (Linux)
tsh device asset-tag (Linux)
Hardware Key Support @jakule
Hardware Key Support is an Enterprise feature and is not available for OSS.
You will need a YubiKey 4.3+ to test this feature.
This feature has additional build requirements, so it should be tested with a pre-release build from Drone (eg: https://get.gravitational.com/teleport-ent-v11.0.0-alpha.2-linux-amd64-bin.tar.gz).
Server Access
These tests should be carried out sequentially. tsh tests should be carried out on Linux, macOS, and Windows.
tsh login as user with Webauthn login and no hardware key requirement.
role.role_options.require_session_mfa: hardware_key - tsh login --request-roles=hardware_key_required
tsh ssh
role.role_options.require_session_mfa: hardware_key_touch - tsh login --request-roles=hardware_key_touch_required
tsh ssh
tsh logout and tsh login as the user with no hardware key requirement.
auth_service.authentication.require_session_mfa: hardware_key
tsh ls) should force automatic re-login with yubikey
tsh ssh
auth_service.authentication.require_session_mfa: hardware_key_touch
tsh ls) should force automatic re-login with yubikey
tsh ssh
Other
Set auth_service.authentication.require_session_mfa: hardware_key_touch in your cluster auth settings.
tsh proxy db --tunnel
HSM Support @nklaassen
Docs
Moderated session @strideynet
Using tsh, join an SSH session as two moderators (two separate terminals, role requires one moderator).
Ctrl+C in the #1 terminal should disconnect the moderator.
Ctrl+C in the #2 terminal should disconnect the moderator and terminate the session, as the session has no moderator.
Using tsh, join an SSH session as two moderators (two separate terminals, role requires one moderator).
t in any terminal should terminate the session for all participants.
Performance @rosstimothy @fspmarshall
Scaling Test
Scale up the number of nodes/clusters a few times for each configuration below.
Perform reverse tunnel node scaling tests with actual nodes for Cloud:
Perform simulated node scaling tests with actual nodes via tctl loadtest node-heartbeats --count=15000 --ttl=2m --interval=1m --labels=2 --concurrency=32 for:
Perform the following additional scaling tests on etcd:
Soak Test
and via label based matching. Tests should only be run against a Cloud
tenant.
Concurrent Session Test
Run a concurrent session test that will spawn 5 interactive sessions per node in the cluster. Tests should only be run against a Cloud tenant:
Robustness
resources which do not require a moderated session and in async recording
mode from an already issued certificate.
which require a moderated session and in async recording mode from an already
issued certificate.
are restarted.
Teleport with Cloud Providers
AWS @camscale
GCP @tigrato
IBM @hugoShaka
Application Access @mdwn
debug_app: true works.
name.rootProxyPublicAddr as well as publicAddr.
name.rootProxyPublicAddr.
app.session.start and app.session.chunk events are created in the Audit Log.
app.session.chunk points to a 5 minute session archive with multiple app.session.request events inside.
tsh play <chunk-id> can fetch and print a session chunk archive.
tsh apps login.
tsh commands.
tsh aws
tsh aws --endpoint-url (this is a hidden flag)
tsh apps login.
tsh az commands.
tsh proxy az and az commands.
tsh apps login.
tsh gcloud commands.
tsh gsutil commands.
tsh proxy gcloud and gcloud/gsutil commands.
tctl create.
tctl create -f.
tctl rm.
Add Application links to documentation.
Database Access @greedy52 + team
IMPORTANT: for self-hosted databases, please verify before and after rotating db and db_client CAs. Note that databases need to be reconfigured with the new certs and CAs. And some databases require slightly different setup: update self-hosted database guides #36260
select pg_sleep(10) followed by ctrl-c is a good query to test.)
assume_role_arn: "" and external_id: "<id>"
assume_role_arn: "" and external_id: "<id>"
assume_role_arn: "" and external_id: "<id>"
Verify all supported modes: keep, best_effort_drop
db.session.start is emitted when you connect.
db.session.end is emitted when you disconnect.
db.session.query is emitted when you execute a SQL query.
tsh db ls shows only databases matching role's db_labels.
db_users.
db_names.
db.session.start is emitted when connection attempt is denied.
db_names.
db.session.query is emitted when command fails due to permissions.
tsh db connect.
tctl create.
tctl create -f.
tctl rm.
Please configure discovery in Discovery Service instead of Database Service.
assume_role_arn and external_id is set.
name, description, type, and labels
Step 2 login value matching the rows name column
labels
TLS Routing @smallinsky
v2 configuration starts only a single listener for proxy service, in contrast with v1 configuration. @smallinsky
Given configuration:
*:3080 for proxy service. Given the configuration above, 3022 and 3025 will be opened for other services.
v1, there should be additional ports 3023 and 3024.
multiplex mode auth_service.proxy_listener_mode: "multiplex" @smallinsky
web_proxy_addr == tunnel_addr
tsh db connect works through proxy running in multiplex mode
tsh proxy db with a GUI client. @greedy52
multiplex mode
ssh -o "ForwardAgent yes" -o "ProxyCommand tsh proxy ssh" <user>@<host>
ssh -o "ForwardAgent yes" -o "ProxyCommand tsh proxy ssh --user=%r --cluster=leaf-cluster %h:%p" <user>@<host>
tsh ssh access through proxy running in multiplex mode
multiplex mode, using tsh
multiplex mode behind L7 load balancer
tsh login and tctl @greedy52
tsh ssh and tsh config @smallinsky
tsh proxy db and tsh db connect @greedy52
tsh proxy app and tsh aws @smallinsky
tsh proxy kube @smallinsky
IGS:
Access Monitoring @smallinsky
Access List @mdwn
Verify Okta Sync Service @tcsc
okta_import_rule rule configuration.