Teleport 14 Test Plan #31122

Manual Testing Plan

Below are the items that should be manually tested with each release of Teleport.
These tests should be run on both a fresh installation of the version to be released
and an upgrade from the previous version of Teleport.

  • Adding nodes to a cluster @bl-nero

    • Adding Nodes via Valid Static Token
    • Adding Nodes via Valid Short-lived Tokens
    • Adding Nodes via Invalid Token Fails
    • Revoking Node Invitation
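
    A minimal sketch for these token tests, assuming a running auth server; the
    addresses and token values are placeholders:

      # Valid short-lived token: generate on the auth server, then join the node.
      tctl tokens add --type=node --ttl=5m
      teleport start --roles=node --token=<token-from-above> --auth-server=auth.example.com:3025

      # Invalid token: joining with a made-up token should fail.
      teleport start --roles=node --token=not-a-real-token --auth-server=auth.example.com:3025

      # Revoking an invitation.
      tctl tokens ls
      tctl tokens rm <token-from-above>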
  • Labels @bl-nero

    • Static Labels
    • Dynamic Labels
  • Trusted Clusters @bl-nero

    • Adding Trusted Cluster Valid Static Token
    • Adding Trusted Cluster Valid Short-lived Token
    • Adding Trusted Cluster Invalid Token
    • Removing Trusted Cluster
    • Changing role map of existing Trusted Cluster
  • RBAC @bl-nero

    Make sure that invalid and valid attempts are reflected in the audit log. Do this with both Teleport and Agentless nodes.

    • Successfully connect to node with correct role
    • Unsuccessfully connect to a node in a role restricting access by label
    • Unsuccessfully connect to a node in a role restricting access by invalid SSH login
    • Allow/deny role option: SSH agent forwarding
    • Allow/deny role option: Port forwarding
    • Allow/deny role option: SSH file copying
  • Verify that custom PAM environment variables are available as expected. @atburke

  • Users @bl-nero

    With every user combination, try to log in and sign up with an invalid
    second factor and an invalid password to see how the system reacts.

    WebAuthn in the release tsh binary is implemented using libfido2 for
    Linux/macOS. Ask for a statically built pre-release binary for realistic
    tests. (tsh fido2 diag should work in our binary.) WebAuthn in the Windows
    build is implemented using webauthn.dll. (tsh webauthn diag with a
    security key selected in the dialog should work.)

    Touch ID requires a signed tsh; ask for a signed pre-release binary so you
    may run the tests.

    Windows WebAuthn requires Windows 10 19H1 and a device capable of Windows
    Hello.

    • Adding Users Password Only

    • Adding Users OTP

    • Adding Users WebAuthn

    • Adding Users via platform authenticator

    • Managing MFA devices

      • Add an OTP device with tsh mfa add
      • Add a WebAuthn device with tsh mfa add
      • Add platform authenticator device with tsh mfa add
      • List MFA devices with tsh mfa ls
      • Remove an OTP device with tsh mfa rm
      • Remove a WebAuthn device with tsh mfa rm
      • Attempt removing the last MFA device on the user
        • with second_factor: on in auth_service, should fail
        • with second_factor: optional in auth_service, should succeed
    • Login Password Only

    • Login with MFA

      • Add an OTP, a WebAuthn and a Touch ID/Windows Hello device with tsh mfa add
      • Login via OTP
      • Login via WebAuthn
      • Login via platform authenticator
      • Login via WebAuthn using an U2F/CTAP1 device
    • Login OIDC

    • Login SAML

    • Login GitHub

    • Deleting Users

  • Backends @capnspacehook

    • Teleport runs with etcd
    • Teleport runs with DynamoDB
      • AWS integration tests are passing
    • Teleport runs with SQLite
    • Teleport runs with Firestore
      • GCP integration tests are passing
    • Teleport runs with Postgres
  • Session Recording @capnspacehook

    • Session recording can be disabled
    • Sessions can be recorded at the node
      • Sessions in remote clusters are recorded in remote clusters
    • Sessions can be recorded at the proxy
      • Sessions on remote clusters are recorded in the local cluster
      • With an OpenSSH server without a Teleport CA signed host certificate:
        • Host key checking enabled rejects connection
        • Host key checking disabled allows connection
  • Enhanced Session Recording @jakule

    • disk, command and network events are being logged.
    • Recorded events can be enforced by the enhanced_recording role option.
    • Enhanced session recording can be enabled on CentOS 7 with kernel 5.8+.
  • Restricted Session @jakule

    • Network requests are allowed when a policy allows them.
    • Network requests are blocked when a policy denies them.
  • Auditd @jakule

    • When auditd is enabled, audit events are recorded —

      // EventType represents an auditd message type.
      // Values come from https://github.com/torvalds/linux/blob/08145b087e4481458f6075f3af58021a3cf8a940/include/uapi/linux/audit.h#L54
      type EventType int

      const (
          AuditGet       EventType = 1000
          AuditUserEnd   EventType = 1106
          AuditUserLogin EventType = 1112
          AuditUserErr   EventType = 1109
      )
      • SSH session start — user login event
      • SSH session end
      • SSH Login failures — SSH auth error
      • SSH Login failures — unknown OS user
      • Session ID is correct (only true when Teleport runs as a systemd service)
      • Teleport user is recorded as an auditd event field
  • Audit Log @Joerger

    • Audit log with dynamodb

      • AWS integration tests are passing
    • Audit log with Firestore

      • GCP integration tests are passing
    • Failed login attempts are recorded

    • Interactive sessions have the correct Server ID

      • server_id is the ID of the node in "session_recording: node" mode
      • server_id is the ID of the node in "session_recording: proxy" mode
      • forwarded_by is the ID of the proxy in "session_recording: proxy" mode

      Node/Proxy ID may be found at /var/lib/teleport/host_uuid on the
      corresponding machine.

      Node IDs may also be queried via tctl nodes ls.

    • Exec commands are recorded

    • scp commands are recorded

    • Subsystem results are recorded

      Subsystem testing may be achieved using both
      Recording Proxy mode
      and
      OpenSSH integration.

      Assuming the proxy is proxy.example.com:3023 and node1 is a node running
      OpenSSH/sshd, you may use the following command to trigger a subsystem audit
      log:

      sftp -o "ProxyCommand ssh -o 'ForwardAgent yes' -p 3023 %[email protected] -s proxy:%h:%p" root@node1
  • Interact with a cluster using tsh @lxea

    These commands should ideally be tested in both recording and non-recording modes, as they are implemented in different ways.

    • tsh ssh <regular-node>
    • tsh ssh <node-remote-cluster>
    • tsh ssh <agentless-node>
    • tsh ssh <agentless-node-remote-cluster>
    • tsh ssh -A <regular-node>
    • tsh ssh -A <node-remote-cluster>
    • tsh ssh -A <agentless-node>
    • tsh ssh -A <agentless-node-remote-cluster>
    • tsh ssh <regular-node> ls
    • tsh ssh <node-remote-cluster> ls
    • tsh ssh <agentless-node> ls
    • tsh ssh <agentless-node-remote-cluster> ls
    • tsh join <regular-node>
    • tsh join <node-remote-cluster>
    • tsh play <regular-node>
    • tsh play <node-remote-cluster>
    • tsh play <agentless-node>
    • tsh play <agentless-node-remote-cluster>
    • tsh scp <regular-node>
    • tsh scp <node-remote-cluster>
    • tsh scp <agentless-node>
    • tsh scp <agentless-node-remote-cluster>
    • tsh ssh -L <regular-node>
    • tsh ssh -L <node-remote-cluster>
    • tsh ssh -L <agentless-node>
    • tsh ssh -L <agentless-node-remote-cluster>
    • tsh ls
    • tsh clusters
  • Interact with a cluster using ssh @capnspacehook
    Make sure to test both recording and regular proxy modes.

    • ssh <regular-node>
    • ssh <node-remote-cluster>
    • ssh <agentless-node>
    • ssh <agentless-node-remote-cluster>
    • ssh -A <regular-node>
    • ssh -A <node-remote-cluster>
    • ssh -A <agentless-node>
    • ssh -A <agentless-node-remote-cluster>
    • ssh <regular-node> ls
    • ssh <node-remote-cluster> ls
    • ssh <agentless-node> ls
    • ssh <agentless-node-remote-cluster> ls
    • scp <regular-node>
    • scp <node-remote-cluster>
    • scp <agentless-node>
    • scp <agentless-node-remote-cluster>
    • ssh -L <regular-node>
    • ssh -L <node-remote-cluster>
    • ssh -L <agentless-node>
    • ssh -L <agentless-node-remote-cluster>
  • Verify proxy jump functionality @Joerger
    Log into the leaf cluster via the root cluster, shut down the root proxy, and verify that proxy jump works.

    • tls routing disabled
      • tsh ssh -J <leaf.proxy.example.com:3023>
      • ssh -J <leaf.proxy.example.com:3023>
    • tls routing enabled
      • tsh ssh -J <leaf.proxy.example.com:3080>
      • tsh proxy ssh -J <leaf.proxy.example.com:3080>
  • Interact with a cluster using the Web UI @bl-nero

    • Connect to a Teleport node
    • Connect to an Agentless node
    • Check agent forwarding is correct based on role and proxy mode.
  • tsh CA loading @atburke

    Create a trusted cluster pair with a node in the leaf cluster. Log into the root cluster.

    • load_all_cas on the root auth server is false (default) -
      tsh ssh leaf.node.example.com results in access denied.
    • load_all_cas on the root auth server is true - tsh ssh leaf.node.example.com
      succeeds.
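
    A sketch of the relevant root-cluster setting, assuming the field sits
    under auth_service in teleport.yaml as the checklist names it:

      auth_service:
        load_all_cas: true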
  • X11 Forwarding @Joerger

    • Install xeyes and xclip:
      • Linux: apt install x11-apps xclip
      • Mac: Install and launch XQuartz which comes with xeyes. Then brew install xclip.
    • Enable X11 forwarding for a Node running as root: ssh_service.x11.enabled = yes
    • Successfully X11 forward as both root and non-root user
      • tsh ssh -X user@node xeyes
      • tsh ssh -X root@node xeyes
    • Test untrusted vs trusted forwarding
      • tsh ssh -Y server01 "echo Hello World | xclip -sel c && xclip -sel c -o" should print "Hello World"
      • tsh ssh -X server01 "echo Hello World | xclip -sel c && xclip -sel c -o" should fail with "BadAccess" X error
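
    A minimal teleport.yaml sketch for the node under test, assuming the
    nested form of the ssh_service.x11.enabled setting named above:

      ssh_service:
        enabled: yes
        x11:
          enabled: yes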

User accounting @atburke

  • Verify that active interactive sessions are tracked in /var/run/utmp on Linux.
  • Verify that interactive sessions are logged in /var/log/wtmp on Linux.
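
  A quick spot-check with standard Linux tools (run while a Teleport
  interactive session is open, then again after it closes):

    who    # reads /var/run/utmp; should list the active session's login
    last   # reads /var/log/wtmp; should show the session's login record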

Combinations @marcoandredinis

For some manual testing, many combinations need to be tested. For example,
interactive sessions have the combinations below (12 applicable; the rest are
marked N/A).

  • N/A Connect to an OpenSSH node in a local cluster using OpenSSH.
  • N/A Connect to an OpenSSH node in a local cluster using Teleport.
  • N/A Connect to an OpenSSH node in a local cluster using the Web UI.
  • Connect to an Agentless node in a local cluster using OpenSSH.
  • Connect to an Agentless node in a local cluster using Teleport.
  • Connect to an Agentless node in a local cluster using the Web UI.
  • Connect to a Teleport node in a local cluster using OpenSSH.
  • Connect to a Teleport node in a local cluster using Teleport.
  • Connect to a Teleport node in a local cluster using the Web UI.
  • N/A Connect to an OpenSSH node in a remote cluster using OpenSSH.
  • N/A Connect to an OpenSSH node in a remote cluster using Teleport.
  • N/A Connect to an OpenSSH node in a remote cluster using the Web UI.
  • Connect to an Agentless node in a remote cluster using OpenSSH (see #31281: Access agentless nodes using hostname with ssh as client).
  • Connect to an Agentless node in a remote cluster using Teleport.
  • Connect to an Agentless node in a remote cluster using the Web UI.
  • Connect to a Teleport node in a remote cluster using OpenSSH.
  • Connect to a Teleport node in a remote cluster using Teleport.
  • Connect to a Teleport node in a remote cluster using the Web UI.

Teleport with EKS/GKE @tigrato

  • Deploy Teleport on a single EKS cluster
  • Deploy Teleport on two EKS clusters and connect them via trusted cluster feature
  • Deploy Teleport Proxy outside GKE cluster fronting connections to it (use this script to generate a kubeconfig)
  • Deploy Teleport Proxy outside EKS cluster fronting connections to it (use this script to generate a kubeconfig)

Teleport with multiple Kubernetes clusters @AntonAM

Note: you can use GKE, EKS, or minikube to run Kubernetes clusters.
The only caveat is minikube: it's not reachable publicly, so don't run a proxy there.

  • Deploy combo auth/proxy/kubernetes_service outside a Kubernetes cluster, using a kubeconfig
    • Login with tsh login, check that tsh kube ls has your cluster
    • Run kubectl get nodes, kubectl exec -it $SOME_POD -- sh
    • Verify that the audit log recorded the above request and session
  • Deploy combo auth/proxy/kubernetes_service inside a Kubernetes cluster
    • Login with tsh login, check that tsh kube ls has your cluster
    • Run kubectl get nodes, kubectl exec -it $SOME_POD -- sh
    • Verify that the audit log recorded the above request and session
  • Deploy combo auth/proxy_service outside the Kubernetes cluster and kubernetes_service inside of a Kubernetes cluster, connected over a reverse tunnel
    • Login with tsh login, check that tsh kube ls has your cluster
    • Run kubectl get nodes, kubectl exec -it $SOME_POD -- sh
    • Verify that the audit log recorded the above request and session
  • Deploy a second kubernetes_service inside another Kubernetes cluster, connected over a reverse tunnel
    • Login with tsh login, check that tsh kube ls has both clusters
    • Switch to a second cluster using tsh kube login
    • Run kubectl get nodes, kubectl exec -it $SOME_POD -- sh on the new cluster
    • Verify that the audit log recorded the above request and session
  • Deploy combo auth/proxy/kubernetes_service outside a Kubernetes cluster, using a kubeconfig with multiple clusters in it
    • Login with tsh login, check that tsh kube ls has all clusters
  • Test Kubernetes screen in the web UI (tab is located on left side nav on dashboard):
    • Verify that all kubes registered are shown with correct name and labels
    • Verify that clicking on a row's Connect button renders a dialog with manual instructions whose Step 2 login value matches the row's Name column
    • Verify searching for name or labels in the search bar works
    • Verify you can sort by the Name column
  • Test Kubernetes exec via WebSockets - client

Kubernetes auto-discovery @tigrato

  • Test Kubernetes auto-discovery:
    • Verify that Azure AKS clusters are discovered and enrolled for different Azure Auth configs:
      • Local Accounts only
      • Azure AD
      • Azure RBAC
    • Verify that AWS EKS clusters are discovered and enrolled
    • Verify that GCP GKE clusters are discovered and enrolled
  • Verify dynamic registration.
    • Can register a new Kubernetes cluster using tctl create.
    • Can update registered Kubernetes cluster using tctl create -f.
    • Can delete registered Kubernetes cluster using tctl rm.
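
  A hedged sketch of the dynamic registration flow (kube_cluster v3 resource;
  the name and labels are placeholders, and spec fields depend on how the
  cluster is reached, so they are left empty here):

    cat > kube_cluster.yaml <<EOF
    kind: kube_cluster
    version: v3
    metadata:
      name: example-cluster
      labels:
        env: dev
    spec: {}
    EOF

    tctl create kube_cluster.yaml      # register
    tctl create -f kube_cluster.yaml   # update in place
    tctl rm kube_cluster/example-cluster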

Kubernetes Secret Storage @AntonAM

  • Kubernetes Secret storage for Agent's Identity
    • Install Teleport agent with a short-lived token
      • Validate that Teleport is installed as a Kubernetes StatefulSet
      • Restart the agent after token TTL expires to see if it reuses the same identity.
    • Force cluster CA rotation

Kubernetes RBAC @AntonAM

  • Verify the following scenarios for kubernetes_resources:
    • {"kind":"pod","name":"*","namespace":"*"} - must allow access to every pod.
    • {"kind":"pod","name":"<somename>","namespace":"*"} - must allow access to pod <somename> in every namespace.
    • {"kind":"pod","name":"*","namespace":"<somenamespace>"} - must allow access to any pod in <somenamespace> namespace.
    • Verify support for * wildcards (<some-name>-*) and regexes for the name and namespace fields.
    • Verify support for deleting a pods collection (must use the Go client).
  • Verify scenarios with multiple roles defining kubernetes_resources:
    • Validate that the returned list of pods is the union of the pods allowed by every role.
    • Validate that access to other pods is denied by RBAC.
    • Validate that the Kubernetes Groups/Users are correctly selected depending on the role that applies to the pod.
      • Test with a kubernetes_groups that denies exec into a pod
  • Verify the following scenarios for Resource Access Requests to Pods:
    • Create a valid resource access request and validate that access to other pods is denied.
    • Validate that creating a resource access request with Kubernetes resources denied by search_as_roles is not allowed.
  • Verify kind: namespace scenarios for kubernetes_resources:
    • Validate that the user can list namespaces.
    • Validate that the user has access to all resources within that namespace - including custom resources.
    • Validate that the user cannot list resources from other namespaces and is jailed - including custom resources.
    • Test with a kubernetes_groups that denies exec into a pod from another namespace
  • Verify the following scenarios for Resource Access Requests to Namespaces:
    • Create a valid resource access request and validate that access to other namespaces is denied.
    • Validate that the user can access all resources within the namespace.
  • Verify that kubernetes_resources is capable of restricting verbs:
    • Restrict access to read-only verbs and try to list, update, and delete a resource
    • Ensure that roles created prior to v7 are converted correctly and no compatibility is lost
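
  A hedged role sketch for the verbs scenario above (v7 role schema; the
  group, labels, and namespace pattern are placeholders):

    kind: role
    version: v7
    metadata:
      name: kube-read-only
    spec:
      allow:
        kubernetes_groups: ["view"]
        kubernetes_labels:
          '*': '*'
        kubernetes_resources:
          - kind: pod
            name: '*'
            namespace: 'dev-*'
            verbs: ['get', 'list', 'watch']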

Teleport with FIPS mode @codingllama

  • Perform trusted clusters, Web, and SSH sanity checks with all Teleport components deployed in FIPS mode.

ACME @bl-nero

  • Teleport can fetch a TLS certificate automatically using the ACME protocol.

Migrations @bl-nero

  • Migrate trusted clusters from 2.4.0 to 2.5.0
    • Migrate auth server on main cluster, then rest of the servers on main cluster
      SSH should work for both main and old clusters
    • Migrate auth server on remote cluster, then rest of the remote cluster
      SSH should work

Command Templates

When interacting with a cluster, the following command templates are useful:

OpenSSH

# when connecting to the recording proxy, `-o 'ForwardAgent yes'` is required.
ssh -o "ProxyCommand ssh -o 'ForwardAgent yes' -p 3023 %[email protected] -s proxy:%h:%p" \
  node.example.com

# the above command only forwards the agent to the proxy, to forward the agent
# to the target node, `-o 'ForwardAgent yes'` needs to be passed twice.
ssh -o "ForwardAgent yes" \
  -o "ProxyCommand ssh -o 'ForwardAgent yes' -p 3023 %[email protected] -s proxy:%h:%p" \
  node.example.com

# when connecting to a remote cluster using OpenSSH, the subsystem request is
# updated with the name of the remote cluster.
ssh -o "ProxyCommand ssh -o 'ForwardAgent yes' -p 3023 %[email protected] -s proxy:%h:%[email protected]" \
  node.foo.com

Teleport

# when connecting to an OpenSSH node, remember `-p 22` needs to be passed.
tsh --proxy=proxy.example.com --user=<username> --insecure ssh -p 22 node.example.com

# an agent can be forwarded to the target node with `-A`
tsh --proxy=proxy.example.com --user=<username> --insecure ssh -A -p 22 node.example.com

# the --cluster flag is used to connect to a node in a remote cluster.
tsh --proxy=proxy.example.com --user=<username> --insecure ssh --cluster=foo.com -p 22 node.foo.com

Teleport with SSO Providers

  • G Suite install instructions work @camscale
    • G Suite Screenshots are up-to-date
  • Azure Active Directory (AD) install instructions work @gabrielcorado
    • Azure Active Directory (AD) Screenshots are up-to-date
  • Active Directory (ADFS) install instructions work @gabrielcorado
    • Active Directory (ADFS) Screenshots are up-to-date
  • Okta install instructions work @mdwn
    • Okta Screenshots are up-to-date
  • OneLogin install instructions work @hugoShaka
    • OneLogin Screenshots are up-to-date
  • GitLab install instructions work @capnspacehook
    • GitLab Screenshots are up-to-date
  • OIDC install instructions work @camscale
    • OIDC Screenshots are up-to-date
  • All providers with guides in docs are covered in this test plan
  • Login Rules work to transform traits from SSO provider @mdwn
  • SAML IdP guide instructions work @mdwn
    • SAML IdP screenshots are up to date

GitHub External SSO @capnspacehook

  • Teleport OSS
    • GitHub organization without external SSO succeeds
    • GitHub organization with external SSO fails
  • Teleport Enterprise
    • GitHub organization without external SSO succeeds
    • GitHub organization with external SSO succeeds

tctl sso family of commands @tcsc

For help with setting up SSO connectors, check out the Quick GitHub/SAML/OIDC Setup Tips.

tctl sso configure helps to construct a valid connector definition:

  • tctl sso configure github ... creates valid connector definitions
  • tctl sso configure oidc ... creates valid connector definitions
  • tctl sso configure saml ... creates valid connector definitions

tctl sso test tests a provided connector definition, which can be loaded from
a file or piped in from tctl sso configure or tctl get --with-secrets. Valid
connectors are accepted; invalid ones are rejected with sensible error messages.

  • Connectors can be tested with tctl sso test.
    • GitHub
    • SAML
    • OIDC
      • Google Workspace
      • Non-Google IdP
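
For example, a configure-then-test pipeline (flags elided, as in the commands
above):

  tctl sso configure github ... | tctl sso test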

Teleport Plugins @EdwardDowling

  • Test receiving a message via Teleport Slackbot
  • Test receiving a new Jira Ticket via Teleport Jira

AWS Node Joining @atburke

Docs

  • On EC2 instance with ec2:DescribeInstances permissions for local account:
    TELEPORT_TEST_EC2=1 go test ./integration -run TestEC2NodeJoin
  • On EC2 instance with any attached role:
    TELEPORT_TEST_EC2=1 go test ./integration -run TestIAMNodeJoin
  • EC2 Join method in IoT mode with node and auth in different AWS accounts
  • IAM Join method in IoT mode with node and auth in different AWS accounts

Kubernetes Node Joining @hugoShaka

  • Join a Teleport node running in the same Kubernetes cluster via a Kubernetes ProvisionToken
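
A hedged sketch of the Kubernetes ProvisionToken (the token name and the
allowed service account are placeholders):

  kind: token
  version: v2
  metadata:
    name: kube-join-token
  spec:
    roles: ["Node"]
    join_method: kubernetes
    kubernetes:
      allow:
        - service_account: "teleport:teleport-agent"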

Azure Node Joining @tcsc

Docs

  • Join a Teleport node running in an Azure VM

GCP Node Joining @tcsc

Docs

  • Join a Teleport node running in a GCP VM.

Cloud Labels @tcsc

  • Create an EC2 instance with tags in instance metadata enabled
    and with tag foo: bar. Verify that a node running on the instance has label
    aws/foo=bar.
  • Create an Azure VM with tag foo: bar. Verify that a node running on the
    instance has label azure/foo=bar.

Passwordless @codingllama

This feature has additional build requirements, so it should be tested with a pre-release build from Drone (e.g. https://get.gravitational.com/teleport-v10.0.0-alpha.2-linux-amd64-bin.tar.gz).

This section complements "Users -> Managing MFA devices". tsh binaries for
each operating system (Linux, macOS, and Windows) must be tested separately for
FIDO2 items.

  • Diagnostics

    Commands should pass all tests.

    • tsh fido2 diag (macOS/Linux)
    • tsh touchid diag (macOS only)
    • tsh webauthnwin diag (Windows only)
  • Registration

    • Register a passwordless FIDO2 key (tsh mfa add, choose WEBAUTHN and
      passwordless)
      • macOS/Linux
      • Windows
    • Register a platform authenticator
      • Touch ID credential (tsh mfa add, choose TOUCHID)
      • Windows Hello credential (tsh mfa add, choose WEBAUTHN and
        passwordless)
  • Login

    • Passwordless login using FIDO2 (tsh login --auth=passwordless)
      • macOS/Linux
      • Windows
    • Passwordless login using platform authenticator (tsh login --auth=passwordless)
      • Touch ID
      • Windows Hello
    • tsh login --auth=passwordless --mfa-mode=cross-platform uses FIDO2
      • macOS/Linux
      • Windows
    • tsh login --auth=passwordless --mfa-mode=platform uses platform authenticator
      • Touch ID
      • Windows Hello
    • tsh login --auth=passwordless --mfa-mode=auto prefers platform authenticator
      • Touch ID
      • Windows Hello
    • Exercise the credential picker (register credentials for multiple users
      on the same device)
      • FIDO2 macOS/Linux
      • Touch ID
      • Windows
    • Passwordless disable switch works
      (auth_service.authentication.passwordless = false)
    • Cluster in passwordless mode defaults to passwordless
      (auth_service.authentication.connector_name = passwordless)
    • Cluster in passwordless mode allows MFA login
      (tsh login --auth=local)
  • Touch ID support commands

    • tsh touchid ls works
    • tsh touchid rm works (careful, may lock you out!)
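
A hedged sketch of the cluster settings referenced above, assuming the fields
sit under auth_service.authentication as the checklist spells them:

  auth_service:
    authentication:
      type: local
      second_factor: "on"
      webauthn:
        rp_id: example.com
      passwordless: true            # disable switch: set to false
      connector_name: passwordless  # passwordless-by-default mode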

Device Trust @codingllama

Device Trust requires Teleport Enterprise.

This feature has additional build requirements, so it should be tested with a
pre-release build from Drone (e.g.
https://get.gravitational.com/teleport-v10.0.0-alpha.2-linux-amd64-bin.tar.gz).

Client-side enrollment requires a signed tsh for macOS; make sure to use the
tsh binary from tsh.app.

A simple formula for testing device authorization is:

# Before enrollment.
# Replace with other kinds of access, as appropriate (db, kube, etc)
tsh ssh node-that-requires-device-trust
> ERROR: ssh: rejected: administratively prohibited (unauthorized device)

# Register the device.
# Get the serial number from "Apple -> About This Mac".
tctl devices add --os=macos --asset-tag=<SERIAL_NUMBER> --enroll

# Enroll the device.
tsh device enroll --token=<TOKEN_FROM_COMMAND_ABOVE>
tsh logout; tsh login

# After enrollment
tsh ssh node-that-requires-device-trust
> $

  • Inventory management

    • Add device (tctl devices add)
    • Add device and create enrollment token (tctl devices add --enroll)
    • List devices (tctl devices ls)
    • Remove device using device ID (tctl devices rm)
    • Remove device using asset tag (tctl devices rm)
    • Create enrollment token using device ID (tctl devices enroll)
    • Create enrollment token using asset tag (tctl devices enroll)
  • Device enrollment

    • Enroll device on macOS (tsh device enroll)

    • Enroll device on Windows (tsh device enroll)

    • Verify device extensions on TLS certificate

      Note that different accesses have different certificates (Database, Kube,
      etc).

      $ openssl x509 -noout -in ~/.tsh/keys/zarquon/llama-x509.pem -nameopt sep_multiline -subject | grep 1.3.9999.3
      > 1.3.9999.3.1=6e60b9fd-1e3e-473d-b148-27b4f158c2a7
      > 1.3.9999.3.2=AAAAAAAAAAAA
      > 1.3.9999.3.3=661c9340-81b0-4a1a-a671-7b1304d28600
    • Verify device extensions on SSH certificate

      ssh-keygen -L -f ~/.tsh/keys/zarquon/llama-ssh/zarquon-cert.pub | grep teleport-device-
      teleport-device-asset-tag ...
      teleport-device-credential-id ...
      teleport-device-id ...
  • Device authorization

    • device_trust.mode other than "off" or "" is not allowed (OSS)

    • device_trust.mode="off" doesn't impede access (Enterprise and OSS)

    • device_trust.mode="optional" doesn't impede access, but issues device
      extensions on login

    • device_trust.mode="required" enforces enrolled devices

    • device_trust.mode="required" is enforced by processes, and not only by
      Auth APIs

      Testing this requires issuing a certificate without device extensions
      (mode="off"), then changing the cluster configuration to mode="required" and
      attempting to access a process directly, without a login attempt.

    • Role-based authz enforces enrolled devices
      (device_trust.mode="off" or "optional",
      role.spec.options.device_trust_mode="required")

    • Device authorization works correctly for both require_session_mfa=false
      and require_session_mfa=true

    • Device authorization applies to SSH access (all items above)

    • Device authorization applies to Trusted Clusters (root with
      mode="optional" and leaf with mode="required")

    • Device authorization applies to Database access (all items above)

    • Device authorization applies to Kubernetes access (all items above)

    • Device authorization does not apply to App access
      (both cluster-wide and role)

    • Device authorization does not apply to Windows Desktop access
      (both cluster-wide and role) (@ibeckermayer)

  • Device audit (see lib/events/codes.go)

    • Inventory management actions issue events (success only)
    • Device enrollment issues device event (any outcomes)
    • Device authorization issues device event (any outcomes)
    • Events with UserMetadata contain TrustedDevice
      data (for certificates with device extensions)
  • Binary support

    • Non-signed and/or non-notarized tsh for macOS gives a sane error
      message for tsh device enroll attempts.
  • Device support commands

    • tsh device collect (macOS)
    • tsh device asset-tag (macOS)
    • tsh device collect (Windows)
    • tsh device asset-tag (Windows)

Hardware Key Support @jakule

Hardware Key Support is an Enterprise feature and is not available for OSS.

You will need a YubiKey 4.3+ to test this feature.

This feature has additional build requirements, so it should be tested with a pre-release build from Drone (e.g. https://get.gravitational.com/teleport-ent-v11.0.0-alpha.2-linux-amd64-bin.tar.gz).

Server Access

These tests should be carried out sequentially. tsh tests should be carried out on Linux, macOS, and Windows.

  1. tsh login as user with WebAuthn login and no hardware key requirement.
  2. Request a role with role.role_options.require_session_mfa: hardware_key - tsh login --request-roles=hardware_key_required
     • Assuming the role should force automatic re-login with the YubiKey
     • tsh ssh
       • Requires the YubiKey to be connected for re-login
       • Prompts for per-session MFA
  3. Request a role with role.role_options.require_session_mfa: hardware_key_touch - tsh login --request-roles=hardware_key_touch_required
     • Assuming the role should force automatic re-login with the YubiKey
       • Prompts for touch if not cached (last touch within 15 seconds)
     • tsh ssh
       • Requires the YubiKey to be connected for re-login
       • Prompts for touch if not cached
  4. tsh logout and tsh login as the user with no hardware key requirement.
  5. Upgrade auth settings to auth_service.authentication.require_session_mfa: hardware_key
     • Using the existing login session (tsh ls) should force automatic re-login with the YubiKey
     • tsh ssh
       • Requires the YubiKey to be connected for re-login
       • Prompts for per-session MFA
  6. Upgrade auth settings to auth_service.authentication.require_session_mfa: hardware_key_touch
     • Using the existing login session (tsh ls) should force automatic re-login with the YubiKey
       • Prompts for touch if not cached
     • tsh ssh
       • Requires the YubiKey to be connected for re-login
       • Prompts for touch if not cached
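
A hedged sketch of the requestable role used in steps 2-3, assuming the
option lives at spec.options.require_session_mfa, matching the setting named
above:

  kind: role
  version: v7
  metadata:
    name: hardware_key_required
  spec:
    options:
      require_session_mfa: hardware_key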

Other

Set auth_service.authentication.require_session_mfa: hardware_key_touch in your cluster auth settings.

  • Database Access: tsh proxy db --tunnel

HSM Support @tobiaszheller

Docs

  • YubiHSM2 Support (@rosstimothy)
    • Make sure docs/links are up to date
    • New cluster with YubiHSM2 CA works
    • Migrating a software cluster to YubiHSM2 works
    • CA rotation works
  • AWS CloudHSM Support
    • Make sure docs/links are up to date
    • New cluster with CloudHSM CA works
    • Migrating a software cluster to CloudHSM works
    • CA rotation works
  • GCP KMS Support
    • Make sure docs/links are up to date
    • New cluster with GCP KMS CA works
    • Migrating a software cluster to GCP KMS works
    • CA rotation works

Moderated session @tobiaszheller

Using tsh, join an SSH session as two moderators (two separate terminals; the role requires one moderator).

  • t in any terminal should terminate the session for all participants.

Performance @rosstimothy @fspmarshall @espadolini

Scaling Test

Scale up the number of nodes/clusters a few times for each configuration below.

  1. Verify that there are no memory/goroutine/file descriptor leaks
  2. Compare the baseline metrics with the previous release to determine if resource usage has increased
  3. Restart all Auth instances and verify that all nodes/clusters reconnect

Perform reverse tunnel node scaling tests for all backend configurations:

  • etcd - 10k
  • DynamoDB - 10k
  • Firestore - 10k
  • Postgres - 10k

Perform the following additional scaling tests on DynamoDB:

  • 10k direct dial nodes.
  • 500 trusted clusters.

Soak Test

Run a 30-minute soak test directly against direct and tunnel nodes
and via label-based matching. Tests should be run against a Cloud
tenant.

tsh bench ssh --duration=30m user@direct-dial-node ls
tsh bench ssh --duration=30m user@reverse-tunnel-node ls
tsh bench ssh --duration=30m user@foo=bar ls
tsh bench ssh --duration=30m --random user@foo ls

Concurrent Session Test

  • Cluster with 1k reverse tunnel nodes

Run a concurrent session test that will spawn 5 interactive sessions per node in the cluster:

tsh bench web sessions --max=5000 user ls

  • Verify that all 5000 sessions are able to be established.
  • Verify that tsh and the web UI are still functional.

Robustness

  • Connectivity Issues:
    • Verify that a lack of connectivity to Auth does not prevent access, with
      an already issued certificate, to resources that do not require a
      moderated session while in async recording mode.
    • Verify that a lack of connectivity to Auth prevents access, even with an
      already issued certificate, to resources that require a moderated
      session while in async recording mode.
    • Verify that an open session is not terminated when all Auth instances
      are restarted.

Teleport with Cloud Providers

AWS @camscale

GCP @tigrato

  • Deploy Teleport to GCP, using Cloud Firestore & Cloud Storage.
  • Deploy Teleport to GKE (Google Kubernetes Engine).
  • Deploy Teleport Enterprise to GCP.

IBM @hugoShaka

  • Deploy Teleport to IBM Cloud, using IBM Cloud Databases for etcd & IBM Cloud Object Storage.
  • Deploy Teleport to IBM Cloud Kubernetes.
  • Deploy Teleport Enterprise to IBM Cloud.
  • Deploy Teleport to IBM Cloud, using IBM Cloud Databases for Postgres.

Application Access @mdwn

  • Run an application within local cluster.
    • Verify the debug application debug_app: true works.
    • Verify an application can be configured with command line flags.
    • Verify an application can be configured from file configuration.
    • Verify that applications are available at auto-generated addresses name.rootProxyPublicAddr as well as at publicAddr.
  • Run an application within a trusted cluster.
    • Verify that applications are available at auto-generated addresses name.rootProxyPublicAddr.
  • Verify Audit Records.
    • app.session.start and app.session.chunk events are created in the Audit Log.
    • app.session.chunk points to a 5 minute session archive with multiple app.session.request events inside.
    • tsh play <chunk-id> can fetch and print a session chunk archive.
  • Verify JWT using verify-jwt.go.
  • Verify RBAC.
  • Verify CLI access with tsh apps login.
  • Verify AWS console access.
    • Can log into AWS web console through the web UI.
    • Can interact with AWS using tsh commands.
      • tsh aws
      • tsh aws --endpoint-url (this is a hidden flag)
  • Verify Azure CLI access with tsh apps login.
    • Can interact with Azure using tsh az commands.
    • Can interact with Azure using a combination of tsh proxy az and az commands.
  • Verify GCP CLI access with tsh apps login.
    • Can interact with GCP using tsh gcloud commands.
    • Can interact with Google Cloud Storage using tsh gsutil commands.
    • Can interact with GCP/GCS using a combination of tsh proxy gcloud and gcloud/gsutil commands.
  • Verify dynamic registration (see the sketch after this list).
    • Can register a new app using tctl create.
    • Can update registered app using tctl create -f.
    • Can delete registered app using tctl rm.
  • Test Applications screen in the web UI (tab is located on left side nav on dashboard):
    • Verify that all apps registered are shown
    • Verify that clicking on the app icon takes you to another tab
    • Verify Add Application links to documentation.
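
A hedged sketch of the dynamic app registration flow referenced above (v3 app
resource; the name, labels, and URI are placeholders):

  cat > app.yaml <<EOF
  kind: app
  version: v3
  metadata:
    name: example-app
    labels:
      env: dev
  spec:
    uri: http://localhost:8080
  EOF

  tctl create app.yaml      # register
  tctl create -f app.yaml   # update
  tctl rm app/example-app   # delete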

Database Access @smallinsky + team

  • Connect to a database within a local cluster.
  • Connect to a database within a remote cluster via a trusted cluster.
  • Verify auto user provisioning. @Tener
    • Self-hosted Postgres.
    • AWS RDS Postgres.
  • Verify audit events. @GavinFrazar
    • db.session.start is emitted when you connect.
    • db.session.end is emitted when you disconnect.
    • db.session.query is emitted when you execute a SQL query.
  • Verify RBAC. @gabrielcorado
    • tsh db ls shows only databases matching role's db_labels.
    • Can only connect as users from db_users.
    • (Postgres only) Can only connect to databases from db_names.
      • db.session.start is emitted when connection attempt is denied.
    • (MongoDB only) Can only execute commands in databases from db_names.
      • db.session.query is emitted when command fails due to permissions.
    • Can configure per-session MFA.
      • MFA tap is required on each tsh db connect.
  • Verify dynamic registration. @GavinFrazar (see the sketch at the end of this section)
    • Can register a new database using tctl create.
    • Can update registered database using tctl create -f.
    • Can delete registered database using tctl rm.
  • Verify discovery.
    Please configure discovery in Discovery Service instead of Database Service.
    • AWS @greedy52
      • Can detect and register RDS instances.
        • Can detect and register RDS instances in an external AWS account when assume_role_arn and external_id are set.
      • Can detect and register RDS proxies, and their custom endpoints.
      • Can detect and register Aurora clusters, and their reader and custom endpoints.
      • Can detect and register Redshift clusters.
      • Can detect and register Redshift serverless workgroups, and their VPC endpoints.
      • Can detect and register ElastiCache Redis clusters.
      • Can detect and register MemoryDB clusters.
      • Can detect and register OpenSearch domains.
    • Azure @GavinFrazar
      • Can detect and register MySQL and Postgres single-server instances.
      • Can detect and register MySQL and Postgres flexible-server instances.
      • Can detect and register Azure Cache for Redis servers.
      • Can detect and register Azure SQL Servers and Azure SQL Managed Instances.
  • Verify Teleport managed users (password rotation, auto 'auth' on connection, etc.). @greedy52
    • Can detect and manage ElastiCache users
    • Can detect and manage MemoryDB users
  • Test Databases screen in the web UI (tab is located on left side nav on dashboard): @GavinFrazar
    • Verify that all dbs registered are shown with correct name, description, type, and labels
    • Verify that clicking on a row's Connect button renders a dialog with manual instructions whose Step 2 login value matches the row's Name column
    • Verify searching for all columns in the search bar works
    • Verify you can sort by all columns except labels
  • Other @smallinsky
    • MySQL server version reported by Teleport is correct.
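
A hedged sketch of the dynamic database registration flow referenced above
(v3 db resource; the name, protocol, and URI are placeholders):

  cat > db.yaml <<EOF
  kind: db
  version: v3
  metadata:
    name: example-postgres
    labels:
      env: dev
  spec:
    protocol: postgres
    uri: localhost:5432
  EOF

  tctl create db.yaml      # register
  tctl create -f db.yaml   # update
  tctl rm db/example-postgres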

TLS Routing @smallinsky

  • Verify that the teleport proxy v2 configuration starts only a single
    listener for the proxy service, in contrast with the v1 configuration.
    Given this configuration: @smallinsky

    version: v2
    proxy_service:
      enabled: "yes"
      public_addr: ['root.example.com']
      web_listen_addr: 0.0.0.0:3080

    There should be a total of three listeners, with only *:3080 for the proxy
    service. Given the configuration above, 3022 and 3025 will be opened for
    other services.

    lsof -i -P | grep teleport | grep LISTEN
      teleport  ...  TCP *:3022 (LISTEN)
      teleport  ...  TCP *:3025 (LISTEN)
      teleport  ...  TCP *:3080 (LISTEN) # <-- proxy service

    In contrast, the same configuration with version v1 should open the
    additional ports 3023 and 3024.

    lsof -i -P | grep teleport | grep LISTEN
      teleport  ...  TCP *:3022 (LISTEN)
      teleport  ...  TCP *:3025 (LISTEN)
      teleport  ...  TCP *:3023 (LISTEN) # <-- extra proxy service port
      teleport  ...  TCP *:3024 (LISTEN) # <-- extra proxy service port
      teleport  ...  TCP *:3080 (LISTEN) # <-- proxy service
  • Run Teleport Proxy in multiplex mode (auth_service.proxy_listener_mode: "multiplex")
    • Trusted cluster
      • Set up trusted clusters using the single-port setup (web_proxy_addr == tunnel_addr)

        kind: trusted_cluster
        spec:
          ...
          web_proxy_addr: root.example.com:443
          tunnel_addr: root.example.com:443
          ...
  • Database Access
  • Application Access @smallinsky
    • Verify app access through proxy running in multiplex mode
  • SSH Access @gabrielcorado
    • Connect to an OpenSSH server through a local ssh proxy: ssh -o "ForwardAgent yes" -o "ProxyCommand tsh proxy ssh" [email protected]
    • Connect to an OpenSSH server on a leaf cluster through a local ssh proxy: ssh -o "ForwardAgent yes" -o "ProxyCommand tsh proxy ssh --user=%r --cluster=leaf-cluster %h:%p" [email protected]
    • Verify tsh ssh access through proxy running in multiplex mode
  • Kubernetes access: @smallinsky
    • Verify kubernetes access through proxy running in multiplex mode
  • Teleport Proxy single port multiplex mode behind L7 load balancer @Tener
    • Agent can join through Proxy and maintain reverse tunnel
    • tsh login and tctl
    • SSH Access: tsh ssh and tsh config
    • Database Access: tsh proxy db and tsh db connect
    • Application Access: tsh proxy app and tsh aws
    • Kubernetes Access: tsh proxy kube

Assist

Assist is not supported by tsh; the Web UI is the only way to use it.
The Assist test plan is in the core section instead of the Web UI section, as most functionality is implemented in the core.

  • Configuration @xacrimon
    • Assist is disabled by default (OSS, Enterprise)
    • Assist can be enabled in the configuration file.
    • Assist is disabled in the Cloud.
    • Assist is enabled by default in the Cloud Team plan.
    • Assist is always disabled when etcd is used as a backend.
  • Conversations @xacrimon
    • A new conversation can be started.
    • SSH command can be executed on one server.
    • SSH command can be executed on multiple servers.
    • SSH command can be executed on a node with per session MFA enabled.
    • Execution output is explained when it fits the context window.
    • Assist can list all nodes/execute a command on all nodes (using embeddings).
    • Access request can be created.
    • Access request is created when approved.
    • Conversation title is set after the first message.
  • SSH integration @xacrimon
    • Assist icon is visible in WebUI's Terminal
    • A Bash command can be generated in the above window.
    • When output is selected in the Terminal, an "Explain" option is available, and it generates a summary.
Test plan cont'd (due to GitHub's issue description size limit).

Desktop Access @ibeckermayer @probakowski

  • Direct mode (set listen_addr):
    • Can connect to desktop defined in static hosts section.
    • Can connect to desktop discovered via LDAP
  • IoT mode (reverse tunnel through proxy):
    • Can connect to desktop defined in static hosts section.
    • Can connect to desktop discovered via LDAP
  • Connect multiple windows_desktop_services to the same Teleport cluster and
    verify that connections to desktops on different AD domains work. (Attempt
    to connect several times to verify that you are routed to the correct
    windows_desktop_service.)
  • Verify user input
    • Download Keyboard Key Info and
      verify all keys are processed correctly in each supported browser. Known
      issue: F11 cannot be captured by the browser without special
      configuration on macOS.
    • Left click and right click register as Windows clicks. (Right click on
      the desktop should show a Windows menu, not a browser context menu)
    • Vertical and horizontal scroll work.
      Horizontal Scroll Test
  • Locking
    • Verify that placing a user lock terminates an active desktop session.
    • Verify that placing a desktop lock terminates an active desktop session.
    • Verify that placing a role lock terminates an active desktop session.
  • Labeling
    • Set client_idle_timeout to a small value and verify that idle sessions
      are terminated (the session should end and an audit event will confirm it
      was due to idle connection)
    • All desktops have teleport.dev/origin label.
    • Dynamic desktops have additional teleport.dev labels for OS, OS
      Version, DNS hostname.
    • Regexp-based host labeling applies across all desktops, regardless of
      origin.
  • RBAC
    • RBAC denies access to a Windows desktop due to labels
    • RBAC denies access to a Windows desktop with the wrong OS-login.
  • Clipboard Support
    • When a user has a role with clipboard sharing enabled and is using a Chromium-based browser
      • Going to a desktop when clipboard permissions are in "Ask" mode (aka "prompt") causes the browser to show a prompt when you first click or press a key
      • The clipboard icon is highlighted in the top bar
      • After allowing clipboard permission, copy text from local workstation, paste into remote desktop
      • After allowing clipboard permission, copy text from remote desktop, paste into local workstation
      • After disallowing clipboard permission, confirm copying text from local workstation and pasting into remote desktop doesn't work
      • After disallowing clipboard permission, confirm copying text from remote desktop and pasting into local workstation doesn't work
    • When a user has a role with clipboard sharing enabled and is not using a Chromium-based browser
      • The clipboard icon is not highlighted in the top bar and copy/paste does not work
    • When a user has a role with clipboard sharing disabled and is using both a Chromium-based and a non-Chromium-based browser (confirm both)
      • The clipboard icon is not highlighted in the top bar and copy/paste does not work
  • Directory Sharing
    • On supported non-Chromium-based browsers (Firefox/Safari)
      • Attempting to share directory logs a sensible warning in the warning dropdown
    • On supported Chromium-based browsers (Chrome/Edge)
      • Begin sharing works
        • The shared directory icon in the top right of the screen is highlighted when directory sharing is initiated
        • The shared directory appears as a network drive named "<directory_name> on teleport"
        • The share directory menu option disappears from the menu
      • Navigation
        • The folders of the shared directory are navigable (move up and down the directory tree)
      • CRUD
        • A new text file can be created
        • The text file can be written to (saved)
        • The text file can be read (close it, check that it's saved on the local machine, then open it again on the remote)
        • The text file can be deleted
      • File/Folder movement
        • In to out (make at least one of these from a non-top-level-directory)
          • A file from inside the shared directory can be drag-and-dropped outside the shared directory
          • A folder from inside the shared directory can be drag-and-dropped outside the shared directory (and its contents retained)
          • A file from inside the shared directory can be cut-pasted outside the shared directory
          • A folder from inside the shared directory can be cut-pasted outside the shared directory
          • A file from inside the shared directory can be copy-pasted outside the shared directory
          • A folder from inside the shared directory can be copy-pasted outside the shared directory
        • Out to in (make at least one of these overwrite an existing file, and one go into a non-top-level directory)
          • A file from outside the shared directory can be drag-and-dropped into the shared directory
          • A folder from outside the shared directory can be drag-and-dropped into the shared directory (and its contents retained)
          • A file from outside the shared directory can be cut-pasted into the shared directory
          • A folder from outside the shared directory can be cut-pasted into the shared directory
          • A file from outside the shared directory can be copy-pasted into the shared directory
          • A folder from outside the shared directory can be copy-pasted into the shared directory
        • Within
          • A file from inside the shared directory cannot be drag-and-dropped to another folder inside the shared directory: a dismissible "Unsupported Action" dialog is shown
          • A folder from inside the shared directory cannot be drag-and-dropped to another folder inside the shared directory: a dismissible "Unsupported Action" dialog is shown
          • A file from inside the shared directory cannot be cut-pasted to another folder inside the shared directory: a dismissible "Unsupported Action" dialog is shown
          • A folder from inside the shared directory cannot be cut-pasted to another folder inside the shared directory: a dismissible "Unsupported Action" dialog is shown
          • A file from inside the shared directory can be copy-pasted to another folder inside the shared directory
          • A folder from inside the shared directory can be copy-pasted to another folder inside shared directory (and its contents retained)
    • RBAC
      • Give the user one role that explicitly disables directory sharing (desktop_directory_sharing: false) and confirm that the option to share a directory doesn't appear in the menu
  • Per-Session MFA (try webauthn on each of Chrome, Safari, and Firefox; u2f only works with Firefox)
    • Attempting to start a session with no keys registered shows an error message
    • Attempting to start a session with a webauthn registered pops up the "Verify Your Identity" dialog
      • Hitting "Cancel" shows an error message
      • Hitting "Verify" causes your browser to prompt you for MFA
      • Cancelling that browser MFA prompt shows an error
      • Successful MFA verification allows you to connect
  • Session Recording
    • Verify sessions are not recorded if all of a user's roles disable recording
    • Verify sync recording (mode: node-sync or mode: proxy-sync)
    • Verify async recording (mode: node or mode: proxy)
    • Sessions show up in session recordings UI with desktop icon
    • Sessions can be played back, including play/pause functionality
    • Session playback speed can be toggled while it's playing
    • Session playback speed can be toggled while it's paused
    • A session that ends with a TDP error message can be played back, ends by displaying the error message,
      and the progress bar progresses to the end.
    • Attempting to play back a session that doesn't exist (i.e. by entering a non-existing session id in the url) shows
      a relevant error message.
    • RBAC for sessions: ensure users can only see their own recordings when
      using the RBAC rule from our
      docs
  • Audit Events (check these after performing the above tests)
    • windows.desktop.session.start (TDP00I) emitted on start
    • windows.desktop.session.start (TDP00W) emitted when session fails to
      start (due to RBAC, for example)
    • client.disconnect (T3006I) emitted when session is terminated by or fails
      to start due to lock
    • windows.desktop.session.end (TDP01I) emitted on end
    • desktop.clipboard.send (TDP02I) emitted for local copy -> remote
      paste
    • desktop.clipboard.receive (TDP03I) emitted for remote copy -> local
      paste
    • desktop.directory.share (TDP04I) emitted when Teleport starts sharing a directory
    • desktop.directory.read (TDP05I) emitted when a file is read over the shared directory
    • desktop.directory.write (TDP06I) emitted when a file is written to over the shared directory
  • Warnings/Errors
    • Induce the backend to send a TDP Notification of severity warning (1), confirm that a warning is logged in the warning dropdown
    • Induce the backend to send a TDP Notification of severity error (2), confirm that session is terminated and error popup is shown
    • Induce the backend to send a TDP Error, confirm that session is terminated and error popup is shown (confirms backwards compatibility w/ older w_d_s starting in Teleport 12)
  • Trusted Cluster / Tunneling
    • Set up Teleport in a trusted cluster configuration where the root and leaf cluster has a w_d_s connected via tunnel (w_d_s running as a separate process)
      • Confirm that windows desktop sessions can be made on root cluster
      • Confirm that windows desktop sessions can be made on leaf cluster
  • Non-AD setup
    • Installer in GUI mode finishes successfully on an instance that is not part of a domain
    • Installer works correctly when invoked from the command line
    • A non-AD instance can be added to the non_ad_hosts section in the config file and is visible in the UI
    • A non-AD instance can be added as a dynamic resource and is visible in the UI
    • A non-AD instance has the label teleport.dev/ad: false
    • Connecting to a non-AD instance works with OSS if there are no more than 5 non-AD desktops
    • Connecting to a non-AD instance fails with OSS if there are more than 5 non-AD desktops
    • Connecting to a non-AD instance always works with an Enterprise license
    • In the OSS version, if there are more than 5 non-AD desktops, a banner shows up telling you to upgrade
    • The banner goes away if you reduce the number of non-AD desktops to 5 or fewer
    • Installer in GUI mode successfully uninstalls the Authentication Package (logging in is not possible)
    • Installer successfully uninstalls the Authentication Package (logging in is not possible) when invoked from the command line

Binaries compatibility @fheinecke

  • Verify tsh runs on:
    • Windows 10
    • macOS

Machine ID @strideynet

SSH

With a default Teleport instance configured with a SSH node:

  • Verify you are able to create a new bot user with tctl bots add robot --roles=access. Follow the instructions provided in the output to start tbot
  • Verify you are able to connect to the SSH node using openssh with the generated ssh_config in the destination directory
  • Verify that after the renewal period (default 20m, but this can be reduced via configuration), newly generated certificates are placed in the destination directory
  • Verify that sending both SIGUSR1 and SIGHUP to a running tbot process causes a renewal and new certificates to be generated
  • Verify that you are able to make a connection to the SSH node using the ssh_config provided by tbot after each phase of a manual CA rotation.
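
A hedged example of the OpenSSH connection test above (the destination
directory path, user, node, and cluster name are placeholders):

  ssh -F /opt/machine-id/ssh_config user@node.example-cluster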

Ensure the above tests are completed for both:

  • Directly connecting to the auth server
  • Connecting to the auth server via the proxy reverse tunnel

DB Access

With a default Postgres DB instance, a Teleport instance configured with DB access and a bot user configured:

  • Verify you are able to connect to and interact with a database using tbot db while tbot start is running

Host users creation @jakule

Host users creation docs
Host users creation RFD

  • Verify host users creation functionality
    • non-existing users are created automatically
    • users are added to groups
      • non-existing configured groups are created
      • created users are added to the teleport-system group
    • users are cleaned up after their session ends
      • cleanup occurs if a program was left running after session ends
    • sudoers file creation is successful
      • Invalid sudoers files are not created
    • existing host users are not modified
    • setting disable_create_host_user: true stops user creation from occurring
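
A hedged role sketch for these tests, assuming the create_host_user option
and the host_groups/host_sudoers allow fields; logins, groups, and sudoers
entries are placeholders:

  kind: role
  version: v7
  metadata:
    name: auto-host-users
  spec:
    options:
      create_host_user: true
    allow:
      logins: [alice]
      host_groups: [docker]
      host_sudoers: ["ALL=(ALL) NOPASSWD: ALL"]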

CA rotations @espadolini

  • Verify the CA rotation functionality itself (by checking in the backend or with tctl get cert_authority)
    • standby phase: only active_keys, no additional_trusted_keys
    • init phase: active_keys and additional_trusted_keys
    • update_clients and update_servers phases: the certs from the init phase are swapped
    • standby phase: only the new certs remain in active_keys, nothing in additional_trusted_keys
    • rollback phase (second pass, after completing a regular rotation): same content as in the init phase
    • standby phase after rollback: same content as in the previous standby phase
  • Verify functionality in all phases (clients might have to log in again instead of waiting for credentials to expire between phases)
    • SSH session in tsh from a previous phase
    • SSH session in web UI from a previous phase
    • New SSH session with tsh
    • New SSH session with web UI
    • New SSH session in a child cluster on the same major version
    • New SSH session in a child cluster on the previous major version
    • New SSH session from a parent cluster
    • Application access through a browser
    • Application access through curl with tsh apps login
    • kubectl get po after tsh kube login
    • Database access (no configuration change should be necessary if the database CA isn't rotated; other Teleport functionality should not be affected if only the database CA is rotated)
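
A sketch of the manual rotation sequence used to step through the phases above (flags per tctl auth rotate --help; --type=host is just an example CA):

tctl auth rotate --manual --type=host --phase=init
tctl status   # confirm the current rotation phase
tctl auth rotate --manual --type=host --phase=update_clients
tctl auth rotate --manual --type=host --phase=update_servers
tctl auth rotate --manual --type=host --phase=standby
# For the rollback checks, run --phase=rollback after init/update_* instead of advancing.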

Proxy Peering

Proxy Peering docs
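
For reference, a minimal sketch of the tunnel strategy that switches a cluster to proxy peering, assuming the shape from the proxy peering docs (the connection count is a placeholder):

cat >> /etc/teleport.yaml <<'EOF'
auth_service:
  tunnel_strategy:
    type: proxy_peering
    agent_connection_count: 1
EOF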

EC2 Discovery @marcoandredinis

EC2 Discovery docs

  • Verify EC2 instance discovery (a matcher sketch follows this list)
    • Only EC2 instances matching given AWS tags have the installer executed on them
    • Only the IAM permissions mentioned in the discovery docs are required for operation
    • Custom scripts specified in different matchers are executed
    • Custom SSM documents specified in different matchers are executed
    • New EC2 instances with matching AWS tags are discovered and added to the Teleport cluster
      • Large numbers of EC2 instances (51+) are all successfully added to the cluster
    • Nodes that have been discovered do not have the install script run on them multiple times
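
A matcher sketch for the discovery checks above, assuming the shape from the EC2 discovery docs (region, tag, and SSM document name are placeholders):

cat >> /etc/teleport.yaml <<'EOF'
discovery_service:
  enabled: yes
  aws:
    - types: ["ec2"]
      regions: ["us-east-1"]
      tags:
        "teleport": "yes"
      ssm:
        # Custom SSM document used to run the installer.
        document_name: "TeleportDiscoveryInstaller"
EOF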

Azure Discovery @hugoShaka

Azure Discovery docs

GCP Discovery @tcsc

GCP Discovery docs

  • Verify GCP instance discovery
    • Only GCP instances matching given GCP tags have the installer executed on them
    • Only the IAM permissions mentioned in the discovery docs are required for operation
    • Custom scripts specified in different matchers are executed
    • New GCP instances with matching GCP tags are discovered and added to the Teleport cluster
      • Large numbers of GCP instances (51+) are all successfully added to the cluster
    • Nodes that have been discovered do not have the install script run on them multiple times

IP Pinning @AntonAM

Add a role with pin_source_ip: true (requires Enterprise) to test IP pinning.
Testing will require changing your IP address (the one the Teleport Proxy sees).
Docs: IP Pinning
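
A role sketch for the checks below, using pin_source_ip as named in the IP Pinning docs (logins and labels are placeholders):

cat > pinned-role.yaml <<'EOF'
kind: role
version: v5
metadata:
  name: pinned-access
spec:
  options:
    # Pin certificates to the client IP observed at login.
    pin_source_ip: true
  allow:
    logins: [root]
    node_labels:
      '*': '*'
EOF
tctl create pinned-role.yaml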

  • Verify that it works for SSH Access
    • You can access tunnel node with tsh ssh on root cluster
    • You can access direct access node with tsh ssh on root cluster
    • You can access tunnel node from Web UI on root cluster
    • You can access direct access node from Web UI on root cluster
    • You can access tunnel node with tsh ssh on leaf cluster
    • You can access direct access node with tsh ssh on leaf cluster
    • You can access tunnel node from Web UI on leaf cluster
    • You can access direct access node from Web UI on leaf cluster
    • You can download files from nodes in the Web UI (the small arrows at the top left corner)
    • If you change your IP, you can no longer access nodes.
  • Verify that it works for Kube Access
    • You can access Kubernetes cluster through standalone Kube service on root cluster
    • You can access Kubernetes cluster through agent inside Kubernetes on root cluster
    • You can access Kubernetes cluster through standalone Kube service on leaf cluster
    • You can access Kubernetes cluster through agent inside Kubernetes on leaf cluster
    • If you change your IP, you can no longer access Kube clusters.
  • Verify that it works for DB Access
    • You can access DB servers on root cluster
    • You can access DB servers on leaf cluster
    • If you change your IP, you can no longer access DB servers.
  • Verify that it works for App Access
    • You can access App service on root cluster
    • You can access App service on leaf cluster
    • If you change your IP, you can no longer access App services.
  • Verify that it works for Desktop Access
    • You can access Desktop service on root cluster
    • You can access Desktop service on leaf cluster
    • If you change your IP, you can no longer access Desktop services.

Resources

Quick GitHub/SAML/OIDC Setup Tips

@zmb3
Collaborator

zmb3 commented Aug 30, 2023

Looks like passwordless registration broke due to a dependency update: #31187

Edit: fixed

@espadolini
Contributor

espadolini commented Aug 31, 2023

PostgreSQL 10k test

Setup

Azure Database for PostgreSQL Flexible Server 15.3 on GP_Standard_D2ds_v4 (2 vCPU, 8GiB of RAM) with 128GiB of storage and Zone-Redundant HA. The kv table was manually altered to REPLICA IDENTITY FULL before running tests (which has no effect on Teleport 14.0.0-alpha.2 but increases the WAL load somewhat).

AKS with Kubernetes 1.26.6, 15 nodes Standard_D16s_v3 (16 vCPU, 64GiB of RAM), same region as the database (northeurope).

Teleport configured with 3 auths and 3 proxies, PostgreSQL pool_max_conns=50.

10k tunnel nodes

Metrics

Screenshot 2023-08-31 alle 12 14 25 Screenshot 2023-08-31 alle 12 14 05 Screenshot 2023-08-31 alle 12 13 58 Screenshot 2023-08-31 alle 12 13 50

Soak test

Ran from a pod in the same cluster as the control plane and the nodes.

# tsh bench --duration=30m ssh root@agents-5c8876d478-zp7bg-27 ls

* Requests originated: 17999
* Requests failed: 0

Histogram

Percentile Response Duration 
---------- ----------------- 
25         142 ms            
50         148 ms            
75         155 ms            
90         163 ms            
95         170 ms            
99         239 ms            
100        2251 ms           

# tsh bench --duration=30m ssh --random root@all ls

* Requests originated: 17998
* Requests failed: 0

Histogram

Percentile Response Duration 
---------- ----------------- 
25         169 ms            
50         180 ms            
75         215 ms            
90         242 ms            
95         253 ms            
99         294 ms            
100        2191 ms           

# tsh bench --duration=30m ssh root@fullname=agents-5c8876d478-226qm-25 ls 

* Requests originated: 17997
* Requests failed: 0

Histogram

Percentile Response Duration 
---------- ----------------- 
25         256 ms            
50         267 ms            
75         280 ms            
90         298 ms            
95         322 ms            
99         364 ms            
100        1408 ms           

5k sessions (on 1k tunnel nodes) then 10k direct connect nodes

Metrics

Screenshot 2023-09-01 alle 13 42 59 Screenshot 2023-09-01 alle 13 46 47 Screenshot 2023-09-01 alle 13 47 12 Screenshot 2023-09-01 alle 13 48 58

Soak test

Ran from a pod in the same cluster as the control plane and the nodes.

root@ubu2:/# tsh bench --duration=30m ssh root@agents-5c8876d478-zx6k5-23 ls

* Requests originated: 17999
* Requests failed: 0

Histogram

Percentile Response Duration 
---------- ----------------- 
25         146 ms            
50         151 ms            
75         157 ms            
90         165 ms            
95         171 ms            
99         189 ms            
100        1854 ms           

root@ubu2:/# tsh bench --duration=30m ssh --random root@all ls

* Requests originated: 17998
* Requests failed: 0

Histogram

Percentile Response Duration 
---------- ----------------- 
25         171 ms            
50         186 ms            
75         225 ms            
90         248 ms            
95         261 ms            
99         285 ms            
100        2185 ms           

root@ubu2:/# tsh bench --duration 30m ssh root@fullname=agents-5c8876d478-z497h-24 ls

* Requests originated: 17998
* Requests failed: 0

Histogram

Percentile Response Duration 
---------- ----------------- 
25         295 ms            
50         307 ms            
75         325 ms            
90         350 ms            
95         370 ms            
99         412 ms            
100        2395 ms           

root@ubu2:/# 

Concurrent sessions test

No errors reported by tsh bench web session --max=5000 joe ls; however, the backend metrics show a sizeable latency spike at the end of the test, shown here on three separate attempts:

Screenshot 2023-09-01 alle 13 49 25

This is not an immediate cause for concern, as real workloads will never shut down 5000 sessions at exactly the same time, but it might be possible to improve the behavior with some tuning (reducing the size of the connection pool might help by reducing the actual need for retries due to contention).

@strideynet
Contributor

In addition to the test plan tasks, Machine ID was also tested for Kubernetes Access and Application Access (I will add these to the test plan before T15)

@codingllama
Contributor

codingllama commented Aug 31, 2023

tsh on Windows panics on mfa add: #31333.

Edit: solved.

@atburke
Contributor

atburke commented Sep 1, 2023

Loading all CAs for tsh ssh is broken: #31339

@tcsc
Contributor

tcsc commented Sep 4, 2023

tctl sso configure github is broken (possibly only in Enterprise clusters). See #31396 (fix in #31397)

@lxea
Contributor

lxea commented Sep 5, 2023

tsh join <agentless-node> seems to be broken #31422
This was never supported.

@GavinFrazar
Contributor

I scratched off the items for "Test Databases screen in the web UI" since that screen was removed and replaced by the unified resource view.

I did verify that searching, filtering, sorting, etc. work in the unified resources view for databases, however. Those test plan steps need updating @avatus

@avatus
Contributor

avatus commented Sep 6, 2023

Will do, thanks @GavinFrazar. #31214

@mdwn
Contributor

mdwn commented Sep 6, 2023

teleport app start appears to be broken. #31496

@espadolini
Contributor

espadolini commented Sep 7, 2023

Firestore 10k test

Metrics

Screenshot 2023-09-07 alle 17 51 03 Screenshot 2023-09-07 alle 17 51 15 Screenshot 2023-09-07 alle 17 51 29 Screenshot 2023-09-07 alle 17 51 53

Soak test

Single node, random and label-based (single node) 30min soak tests passed with no errors.

@mdwn
Contributor

mdwn commented Sep 7, 2023

AWS roles do not show up when trying to log in to the AWS console from the new unified resource view in the web UI: #31573

@hugoShaka
Contributor

hugoShaka commented Sep 7, 2023

Azure Discovery keeps running the discovery script on already-joined VMs every 10 minutes, but it seems the bug was already present in 13: #28879

@hugoShaka
Contributor

Azure Discovery permissions are not up to date, and following the docs doesn't allow you to set up a working discovery service: #31602

@rosstimothy
Contributor

Agents running versions older than v14 are not able to connect to a v14 cluster: #31607

@rosstimothy
Contributor

Cloud Load Tests

30k Scaling Test

https://grafana-staging-onprem.platform.teleport.sh/goto/60SnOjzIR?orgId=1

10k Concurrent Sessions

Soak Tests

Origin: us-east-1, Target: us-east-1
kubectl logs -n soaktest -f pod/soaktest-7zpdz-56wqw
+ tsh --proxy=benchmark.cloud.gravitational.io:443 -i /etc/teleport/auth bench ssh --duration=30m root@node-agents-766996b7b9-zv7b2-09 ls

* Requests originated: 17999
* Requests failed: 0

Histogram

Percentile Response Duration
---------- -----------------
25         191 ms
50         195 ms
75         201 ms
90         210 ms
95         216 ms
99         270 ms
100        5199 ms

+ tsh --proxy=benchmark.cloud.gravitational.io:443 -i /etc/teleport/auth bench ssh --duration=30m root@fullname=node-agents-766996b7b9-zv7b2-09 ls

* Requests originated: 17996
* Requests failed: 0

Histogram

Percentile Response Duration
---------- -----------------
25         466 ms
50         473 ms
75         481 ms
90         493 ms
95         505 ms
99         551 ms
100        1193 ms

+ tsh --proxy=benchmark.cloud.gravitational.io:443 -i /etc/teleport/auth bench ssh --duration=30m --random root@all ls

* Requests originated: 17999
* Requests failed: 0

Histogram

Percentile Response Duration
---------- -----------------
25         188 ms
50         195 ms
75         203 ms
90         215 ms
95         229 ms
99         284 ms
100        9191 ms

https://grafana-staging-onprem.platform.teleport.sh/goto/QDvmOCzIR?orgId=1

Origin: us-west-2, Target: us-east-1

kubectl logs -n soaktest -f pod/soaktest-rkbzk-zvf2z
+ tbot start --data-dir=/var/lib/teleport/bot --destination-dir=/opt/machine-id --token=163cbdb82281e399049c8034ef77219b --join-method=token --auth-server=benchmark.cloud.gravitational.io:443 --certificate-ttl=8h --oneshot
  [TBOT]      INFO Anonymous telemetry is not enabled. Find out more about Machine ID's anonymous telemetry at https://goteleport.com/docs/machine-id/reference/telemetry/ tbot/anonymous_telemetry.go:82
  [TBOT]      INFO Created directory "/var/lib/teleport/bot" config/destination_directory.go:135
  [TBOT]      INFO Created directory "/opt/machine-id" config/destination_directory.go:135
  [TBOT]      INFO Initializing bot identity. tbot/tbot.go:254
  [TBOT]      INFO Loading existing bot identity from store. store:directory: /var/lib/teleport/bot tbot/tbot.go:325
  [TBOT]      INFO No existing bot identity found in store. Bot will join using configured token. tbot/tbot.go:329
  [TBOT]      INFO Fetching bot identity using token. tbot/bot_identity.go:193
  [AUTH]      INFO Attempting registration via proxy server. auth/register.go:278
  [AUTH]      INFO Successfully registered via proxy server. auth/register.go:285
  [TBOT]      INFO Fetched new bot identity. identity:valid: after=2023-09-05T19:16:47Z, before=2023-09-06T03:17:47Z, duration=8h1m0s | kind=tls, renewable=true, disallow-reissue=false, roles=[bot-soaktest-bot], principals=[-teleport-internal-join], generation=1 tbot/tbot.go:298
  [TBOT]      INFO Bot initialization complete. tbot/tbot.go:316
  [TBOT]      INFO One-shot mode enabled. Generating outputs. tbot/tbot.go:118
  [TBOT]      INFO Generating output. output:identity (directory: /opt/machine-id) tbot/impersonated_identity.go:528
  [TBOT]      INFO Generated output. output:identity (directory: /opt/machine-id) tbot/impersonated_identity.go:573
  [TBOT]      INFO Generated outputs. One-shot mode is enabled so exiting. tbot/tbot.go:123
+ tsh --proxy=benchmark.cloud.gravitational.io:443 -i /opt/machine-id/identity bench ssh --duration=30m root@node-agents-5d68d45658-25b9q-00 ls

* Requests originated: 17992
* Requests failed: 0

Histogram

Percentile Response Duration
---------- -----------------
25         881 ms
50         892 ms
75         912 ms
90         930 ms
95         936 ms
99         951 ms
100        5619 ms

+ tsh --proxy=benchmark.cloud.gravitational.io:443 -i /opt/machine-id/identity bench ssh --duration=30m root@fullname=node-agents-5d68d45658-25b9q-00 ls

* Requests originated: 17991
* Requests failed: 0

Histogram

Percentile Response Duration
---------- -----------------
25         903 ms
50         917 ms
75         948 ms
90         1027 ms
95         1039 ms
99         1054 ms
100        1136 ms

https://grafana-staging-onprem.platform.teleport.sh/goto/S80kdCzIR?orgId=1

Origin: us-east-1, Target: us-west-2
kubectl logs -n soaktest -f pod/soaktest-9vg9b-k458s
+ tbot start --data-dir=/var/lib/teleport/bot --destination-dir=/opt/machine-id --token=b29218b11195ef04f77b1ab93b7382fc --join-method=token --auth-server=benchmark.cloud.gravitational.io:443 --certificate-ttl=8h --oneshot
  INFO [TBOT]      Created directory "/var/lib/teleport/bot" config/destination_directory.go:135
  INFO [TBOT]      Anonymous telemetry is not enabled. Find out more about Machine ID's anonymous telemetry at https://goteleport.com/docs/machine-id/reference/telemetry/ tbot/anonymous_telemetry.go:82
  INFO [TBOT]      Created directory "/opt/machine-id" config/destination_directory.go:135
  INFO [TBOT]      Initializing bot identity. tbot/tbot.go:254
  INFO [TBOT]      Loading existing bot identity from store. store:directory: /var/lib/teleport/bot tbot/tbot.go:325
  INFO [TBOT]      No existing bot identity found in store. Bot will join using configured token. tbot/tbot.go:329
  INFO [TBOT]      Fetching bot identity using token. tbot/bot_identity.go:193
  INFO [AUTH]      Attempting registration via proxy server. auth/register.go:278
  INFO [AUTH]      Successfully registered via proxy server. auth/register.go:285
  INFO [TBOT]      Fetched new bot identity. identity:valid: after=2023-09-05T20:25:33Z, before=2023-09-06T04:26:32Z, duration=8h0m59s | kind=tls, renewable=true, disallow-reissue=false, roles=[bot-soaktest-bot], principals=[-teleport-internal-join], generation=1 tbot/tbot.go:298
  INFO [TBOT]      Bot initialization complete. tbot/tbot.go:316
  INFO [TBOT]      One-shot mode enabled. Generating outputs. tbot/tbot.go:118
  INFO [TBOT]      Generating output. output:identity (directory: /opt/machine-id) tbot/impersonated_identity.go:528
  INFO [TBOT]      Generated output. output:identity (directory: /opt/machine-id) tbot/impersonated_identity.go:573
  INFO [TBOT]      Generated outputs. One-shot mode is enabled so exiting. tbot/tbot.go:123
+ tsh --proxy=benchmark.cloud.gravitational.io:443 -i /opt/machine-id/identity bench ssh --duration=30m root@node-agents-5d68d45658-z8nz7-00 ls

* Requests originated: 17992
* Requests failed: 0

Histogram

Percentile Response Duration
---------- -----------------
25         854 ms
50         863 ms
75         868 ms
90         875 ms
95         880 ms
99         910 ms
100        1270 ms

+ tsh --proxy=benchmark.cloud.gravitational.io:443 -i /opt/machine-id/identity bench ssh --duration=30m root@fullname=node-agents-5d68d45658-z8nz7-00 ls

* Requests originated: 17989
* Requests failed: 0

Histogram

Percentile Response Duration
---------- -----------------
25         1125 ms
50         1133 ms
75         1143 ms
90         1151 ms
95         1159 ms
99         1191 ms
100        3631 ms

https://grafana-staging-onprem.platform.teleport.sh/goto/S80kdCzIR?orgId=1

@fspmarshall
Contributor

ETCD 10k Loadtest (simulated)

This is a first attempt at creating a "simulated" loadtesting procedure. The procedure is simulated in that it stresses the backend by using tctl loadtest node-heartbeats rather than by creating actual Teleport nodes. The backend and control plane are created as usual.

For this initial attempt the following command was run concurrently on each auth pod (note the 5k count, totaling 10k since two auth pods were created):

tctl loadtest node-heartbeats --count=5000 --ttl=2m --interval=1m --labels=2 --concurrency=32

This method of loadtesting generally produces significantly less load on the control plane, but roughly equivalent load on the backend itself:
Screenshot 2023-09-08 at 7 43 41 PM

Screenshot 2023-09-08 at 7 45 52 PM

In addition to the above metrics, auth logs were specifically monitored for any cache and/or event system related errors. While we don't anticipate such things at 10k these days, such errors would be one of the main signs of regression that might not be immediately obvious.

As a point of future improvement, I think we should start tracking cache resets and watcher buffer overflows as part of our standard suite of backend metrics so that we can better monitor the health of the event system.

@fspmarshall
Contributor

fspmarshall commented Sep 9, 2023

ETCD 30k Loadtest (simulated)

See #31122 (comment) for an explanation of the simulated loadtest procedure. The 30k procedure was identical, except that 15k heartbeats were applied per auth server instead of 5k.

Screenshot 2023-09-08 at 8 25 12 PM Screenshot 2023-09-08 at 8 30 53 PM

As with the 10k procedure, logs were explicitly monitored for cache and event system issues. None were observed, but the metrics improvement thoughts from that comment still stand.

@gabrielcorado
Contributor

Database Access load test (PostgreSQL and MySQL)

Setup

EKS with a single node group:

  • Min: 2, Max: 10 instances.
  • Instance class: m5.4xlarge
  • Kubernetes version: 1.27

Teleport cluster (all deployed on the EKS cluster):

  • DynamoDB backend
  • 3 Auth servers
  • 3 Proxy instances
  • 1 Database Agent

Databases:

  • Single PostgreSQL RDS instance on a db.t4g.xlarge instance class. Accessed through RDS Proxy with single RW endpoint.
  • Single MySQL RDS instance on a db.t4g.xlarge instance class. Accessed through RDS Proxy with single RW endpoint.

Note: Databases were configured using discovery running inside the database agent.

tsh bench commands were executed inside the cluster.

MySQL

10 connections/second

# tsh bench mysql mysql-proxy-rdsproxy --db-user=mysql --db-name=mysql --rate=10 --duration=30m

* Requests originated: 18000
* Requests failed: 0

Histogram

Percentile Response Duration
---------- -----------------
25         60 ms
50         63 ms
75         68 ms
90         75 ms
95         81 ms
99         100 ms
100        3105 ms

50 connections/second

# tsh bench mysql mysql-proxy-rdsproxy --db-user=mysql --db-name=mysql --rate=50 --duration=30m

* Requests originated: 89951
* Requests failed: 81
* Last error: io.ReadFull(header) failed. err EOF: connection was bad

Histogram

Percentile Response Duration
---------- -----------------
25         520 ms
50         709 ms
75         880 ms
90         1036 ms
95         1142 ms
99         1363 ms
100        2289 ms

Notes

The failed connections happened at the end of the benchmark test, where the final connections didn't have a chance to complete because tsh bench canceled them.

PostgreSQL

10 connections/second

# tsh bench postgres postgres-proxy-rdsproxy --db-user=postgres --db-name=postgres --rate=10 --duration=30m

* Requests originated: 18000
* Requests failed: 0

Histogram

Percentile Response Duration
---------- -----------------
25         74 ms
50         77 ms
75         82 ms
90         87 ms
95         93 ms
99         113 ms
100        1112 ms

50 connections/second

# tsh bench postgres postgres-proxy-rdsproxy --db-user=postgres --db-name=postgres --rate=50 --duration=30m

* Requests originated: 89916
* Requests failed: 6813
* Last error: failed to connect to `host=127.0.0.1 user=postgres database=postgres`: server error (: failed to connect to any of the database servers (SQLSTATE ))

Histogram

Percentile Response Duration
---------- -----------------
25         712 ms
50         1080 ms
75         1403 ms
90         1623 ms
95         1743 ms
99         1999 ms
100        2963 ms

Notes

Most of the connection failures were due to the proxy not being able to communicate with the database agent.

Logs
WARN [DB:PROXY]  Failed to dial database DatabaseServer(Name=gabrielcorado-loadtest-postgres-proxy-rdsproxy-us-east-1-278576220453, Version=14.0.0-alpha.2, Hostname=database-agents-0, HostID=203cb416-7a12-4075-a8d8-c92bd0b1c7c1, Database=Database(Name=gabrielcorado-loadtest-postgres-proxy-rdsproxy-us-east-1-278576220453, Type=rdsproxy, Labels=map[account-id:278576220453 engine:POSTGRESQL loadtest:gabrielcorado-loadtest region:us-east-1 teleport.dev/cloud:AWS teleport.dev/origin:cloud teleport.internal/discovered-name:gabrielcorado-loadtest-postgres-proxy vpc-id:vpc-0452b380742d5815c])). error:[
ERROR REPORT:
Original Error: *trace.ConnectionProblemError Teleport proxy failed to connect to "db" agent "@local-node" over reverse tunnel:
  no tunnel connection found: no db reverse tunnel for 203cb416-7a12-4075-a8d8-c92bd0b1c7c1.gabrielcorado-loadtest.teleportdemo.net found
This usually means that the agent is offline or has disconnected. Check the
agent logs and, if the issue persists, try restarting it or re-registering it
with the cluster.
Stack Trace:
       github.com/gravitational/teleport/lib/reversetunnel/localsite.go:582 github.com/gravitational/teleport/lib/reversetunnel.(*localSite).getConn
       github.com/gravitational/teleport/lib/reversetunnel/localsite.go:306 github.com/gravitational/teleport/lib/reversetunnel.(*localSite).DialTCP
       github.com/gravitational/teleport/lib/reversetunnel/localsite.go:274 github.com/gravitational/teleport/lib/reversetunnel.(*localSite).Dial
       github.com/gravitational/teleport/lib/srv/db/proxyserver.go:471 github.com/gravitational/teleport/lib/srv/db.(*ProxyServer).Connect
       github.com/gravitational/teleport/lib/srv/db/postgres/proxy.go:102 github.com/gravitational/teleport/lib/srv/db/postgres.(*Proxy).handleConnection
       github.com/gravitational/teleport/lib/srv/db/postgres/proxy.go:66 github.com/gravitational/teleport/lib/srv/db/postgres.(*Proxy).HandleConnection
       github.com/gravitational/teleport/lib/srv/db/proxyserver.go:349 github.com/gravitational/teleport/lib/srv/db.(*ProxyServer).handleConnection
       github.com/gravitational/teleport/lib/srv/db/proxyserver.go:303 github.com/gravitational/teleport/lib/srv/db.(*ProxyServer).ServeTLS.func1
       runtime/asm_amd64.s:1650 runtime.goexit
User Message: Teleport proxy failed to connect to "db" agent "@local-node" over reverse tunnel:
  no tunnel connection found: no db reverse tunnel for 203cb416-7a12-4075-a8d8-c92bd0b1c7c1.gabrielcorado-loadtest.teleportdemo.net found
This usually means that the agent is offline or has disconnected. Check the
agent logs and, if the issue persists, try restarting it or re-registering it
with the cluster.] db/proxyserver.go:483

During the entire test, the database agent was logging the warning Failed to emit audit event db.session.query(TDB02I). This server's connection to the auth service appears to be slow. (events/emitter.go:113) for all DB session events.

Worth noting that during the tests a single Audit instance was handling all the audit events, which could cause the delayed processing:

Screenshot 2023-09-06 at 12 42 06

@camscale
Contributor

camscale commented Sep 11, 2023

Teleport fails to start with "distant" DynamoDB backend: #31690

@tigrato
Contributor

tigrato commented Sep 11, 2023

Kubernetes Access load test

Setup

EKS with a single node group:

  • 5 instances.
  • Instance class: m5.4xlarge
  • Kubernetes version: 1.27

Teleport cluster (all deployed on the EKS cluster):

  • DynamoDB backend
  • 3 Auth servers
  • 3 Proxy instances
  • 1 Kubernetes Agent

tsh bench commands were executed inside the cluster.

kubectl get pods

This test involves forwarding the request to the upstream service, unmarshaling the response, filtering it, and returning it to the end user.

10 connections/second

rate10ps

latency10ps

size10ps

tsh bench kube ls my-cluster --rate=10 --duration=30m

* Requests originated: 18000
* Requests failed: 0

Histogram

Percentile Response Duration 
---------- ----------------- 
25         28 ms             
50         29 ms             
75         30 ms             
90         33 ms             
95         36 ms             
99         47 ms             
100        167 ms            

50 connections/second

screenshot_2023-09-11_15:55:17_selection
screenshot_2023-09-11_15:55:22_selection
screenshot_2023-09-11_15:55:27_selection
screenshot_2023-09-11_15:55:31_selection
screenshot_2023-09-11_15:57:19_selection
screenshot_2023-09-11_15:57:27_selection

tsh bench kube ls my-cluster --duration=30m --rate 50

* Requests originated: 89999
* Requests failed: 0

Histogram

Percentile Response Duration 
---------- ----------------- 
25         28 ms             
50         29 ms             
75         32 ms             
90         36 ms             
95         43 ms             
99         81 ms             
100        735 ms      

100 connections/second

screenshot_2023-09-11_15:49:28_selection
screenshot_2023-09-11_15:49:35_selection
screenshot_2023-09-11_15:49:40_selection
screenshot_2023-09-11_15:49:47_selection

screenshot_2023-09-11_15:57:55_selection
screenshot_2023-09-11_15:58:01_selection


tsh bench kube ls my-cluster --duration=30m --rate 100

* Requests originated: 179998
* Requests failed: 0

Histogram

Percentile Response Duration 
---------- ----------------- 
25         26 ms             
50         28 ms             
75         32 ms             
90         47 ms             
95         72 ms             
99         176 ms            
100        743 ms 

kubectl exec

For Kubernetes exec we are using the same cluster as above, but we are executing the date command inside a Pod.
There are some limitations around the SPDY executor that prevent us from getting higher throughput than 30 commands/second. This is a known issue and we are going to address it by using a different executor for Kubernetes 1.29+.

Until then, we are limited to lower numbers of commands/second.

5 connections/second

tsh bench --duration=10m --rate=5  kube exec my-cluster  ubuntu date

* Requests originated: 3000
* Requests failed: 0

Histogram

Percentile Response Duration 
---------- ----------------- 
25         122 ms            
50         134 ms            
75         145 ms            
90         155 ms            
95         160 ms            
99         173 ms            
100        437 ms      

30 connections/second

This test was achieved by running 3 tsh bench commands in parallel and combining the results.

tsh bench --duration=10m --rate=30  kube exec my-cluster  ubuntu date

* Requests originated: 18000
* Requests failed: 0

Histogram

Percentile Response Duration 
---------- ----------------- 
25         121 ms            
50         134 ms            
75         144 ms            
90         154 ms            
95         161 ms            
99         180 ms            
100        438 ms            


@smallinsky
Contributor

tbot db access backward compatibility issue: #31750

@hugoShaka
Contributor

Not sure if this is a bug, but Azure VMs belonging to Scale Sets are not discovered: #31758

@fspmarshall
Contributor

500 Trusted Clusters (etcd)

Screenshot 2023-09-11 at 9 30 02 PM

@capnspacehook
Contributor

capnspacehook commented Sep 12, 2023

Playing a leaf SSH session recorded at the proxy fails: #31776

@tcsc
Contributor

tcsc commented Sep 13, 2023

GCP Discovery appears totally broken. Existing issue: #31386

@AntonAM
Contributor

AntonAM commented Sep 13, 2023

IP-pinned users can't upload/download files after connecting to nodes in the web UI: #31845

@tigrato
Contributor

tigrato commented Sep 15, 2023

#30248 broke some Kubernetes watch streams when they were not very active or were too slow, e.g. the watcher created by kubectl run -it alpine --image alpine --command sh to receive the pod status.
Fix: #31945

@fspmarshall
Contributor

ETCD Soak Tests

tsh bench ssh --duration=30m root@node ls

* Requests originated: 17999
* Requests failed: 0

Histogram

Percentile Response Duration 
---------- ----------------- 
25         103 ms            
50         108 ms            
75         119 ms            
90         125 ms            
95         128 ms            
99         146 ms            
100        326 ms            

tsh bench ssh --duration=30m root@label=value ls

* Requests originated: 17999
* Requests failed: 0

Histogram

Percentile Response Duration 
---------- ----------------- 
25         111 ms            
50         114 ms            
75         127 ms            
90         132 ms            
95         135 ms            
99         156 ms            
100        5519 ms           

tsh bench ssh --duration=30m --random root@all ls

* Requests originated: 17999
* Requests failed: 345
* Last error: failed connecting to host ac849c5d-1406-4995-baae-a0f793079190:0: failed to receive cluster details response
	failed to dial target host
	Teleport proxy failed to connect to "node" agent "@local-node" over reverse tunnel:

  no tunnel connection found: no node reverse tunnel for ac849c5d-1406-4995-baae-a0f793079190.fspm-loadtest.teleport-test.com found

This usually means that the agent is offline or has disconnected. Check the
agent logs and, if the issue persists, try restarting it or re-registering it
with the cluster.

Histogram

Percentile Response Duration 
---------- ----------------- 
25         106 ms            
50         117 ms            
75         124 ms            
90         135 ms            
95         138 ms            
99         149 ms            
100        334 ms

Note: the missing node during the --random run was unrelated to the test.

@zmb3 zmb3 closed this as completed Sep 22, 2023