diff --git a/docs/README.md b/docs/README.md index 0c24ff8f9920..8f0be780011f 100644 --- a/docs/README.md +++ b/docs/README.md @@ -38,6 +38,7 @@ Also see the [FAQ](./faq.md). ## Other Docs 1. [Integration Tests](../test/integration/connect/envoy/README.md) +1. [Upgrade Tests](../test/integration/consul-container/test/upgrade/README.md) ## Important Directories diff --git a/test/integration/consul-container/test/upgrade/README.md b/test/integration/consul-container/test/upgrade/README.md index 545b36cc0bf2..6815323af0df 100644 --- a/test/integration/consul-container/test/upgrade/README.md +++ b/test/integration/consul-container/test/upgrade/README.md @@ -1,29 +1,47 @@ -- [Upgrade Integration Tests](#upgrade-integration-tests) - - [Getting Started](#getting-started) - - [Prerequisites](#prerequisites) - - [Running Upgrade integration tests](#running-upgrade-integration-tests) - - [Adding a new upgrade integration test](#adding-a-new-upgrade-integration-test) - - [How it works](#how-it-works) - - [Errors Test Cases](#errors-test-cases) - - [FAQS](#faqs) # Upgrade Integration Tests +- [Introduction](#introduction) + - [How it works](#how-it-works) +- [Getting Started](#getting-started) + - [Prerequisites](#prerequisites) + - [Running Upgrade integration tests](#running-upgrade-integration-tests) +- [Adding a new upgrade integration test](#adding-a-new-upgrade-integration-test) + - [Errors Test Cases](#errors-test-cases) +- [FAQS](#faqs) + + +## Introduction + The goal of upgrade tests is to ensure problem-free upgrades on supported upgrade paths. At any given time, Consul supports the latest minor release, and two older minor releases, e.g. 1.15, 1.14, and 1.13. Upgrades to any higher version are permitted, including skipping a minor version e.g. from 1.13 to 1.15. The upgrade tests also aims to highlight errors that may occur as users attempt to upgrade their current version to a newer version. -Here is an example of how the upgrade tests work -1. Create a cluster with a specified number of server and client agents, then enable the feature to be tested. -2. Create some workload in the cluster; register 2 services: static-server, static-client. -3. Configure Consul intention to deny connection between static client and server. Ensure that a connection cannot be made. -4. Upgrade Consul cluster and restart the Envoy sidecars (we restart Envoy sidecar to ensure the upgraded Consul binary can read the state from the previous version and generate the correct Envoy configurations) -5. Verify connection / disconnection (e.g., deny Action) +### How it works + +This diagram illustrates the deployment architecture of an upgrade test, where +two consul agents (one server and one client), a static-server, static-client, +and envoy sidecars are deployed. + +isolated > Note that all consul agents and user workloads such as application services, mesh-gateway are running in docker containers. +In general, each upgrade test has following steps: +1. Create a cluster with a specified number of server and client agents, then enable the feature to be tested. +2. Create some workload in the cluster, e.g., registering 2 services: static-server, static-client. +Static-server is a simple http application and the upstream service of static-client. +3. Make additional configuration to the cluster. For example, configure Consul intention to deny +connection between static client and server. Ensure that a connection cannot be made. +4. Upgrade Consul cluster to the `target-version` and restart the Envoy sidecars +(we restart Envoy sidecar to ensure the upgraded Consul binary can read the state from +the previous version and generate the correct Envoy configurations) +5. Re-validate the client, server and sidecars to ensure the persisted data from the pervious +version can be accessed in the target version. Verify connection / disconnection +(e.g., deny Action) + ## Getting Started ### Prerequisites -If you wish to run or add new test cases, the following are required: +To run the upgrade test, the following tools are required: - install [Go](https://go.dev/) (the version should match that of our CI config's Go image). - install [`golangci-lint`](https://golangci-lint.run/usage/install/) - install [`Makefile`](https://www.gnu.org/software/make/manual/make.html) @@ -56,11 +74,32 @@ Below are the supported CLI options | -follow-log | true | Emit all container logs. These can be noisy, so we recommend `--follow-log=false` for local development. -## Adding a new upgrade integration test -Upgrade integration tests are defined in the [test/integration/consul-container/test/upgrade](/test/integration/consul-container/test/upgrade) subdirectory and new upgrade integration tests should always be added to this location. The test framework uses -[functional table-driven tests in Go](https://yourbasic.org/golang/table-driven-unit-test/) and using function types to modify the base value for each test case. +## Adding a new upgrade integration test + +All upgrade tests are defined in [test/integration/consul-container/test/upgrade](/test/integration/consul-container/test/upgrade) subdirectory. The test framework uses +[functional table-driven tests in Go](https://yourbasic.org/golang/table-driven-unit-test/) and +using function types to modify the basic configuration for each test case. + +Following is a guide for adding a new upgrade test case. +1. Create consul cluster(s) with a specified version. Some utility functions are provided to make +a single cluster or two peered clusters: -For tests with multiple test cases, it should always start by invoking +```go + // NewCluster creates a single cluster + cluster, _, _ := libtopology.NewCluster(t, &libtopology.ClusterConfig{ + NumServers: 1, + NumClients: 1, + BuildOpts: &libcluster.BuildOptions{ + Datacenter: "dc1", + ConsulVersion: oldVersion, + }, + }) + + // BasicPeeringTwoClustersSetup creates two peered clusters, named accpeting and dialing + accepting, dialing := libtopology.BasicPeeringTwoClustersSetup(t, oldVersion, false) +``` + +2. For tests with multiple test cases, it should always start by invoking ```go type testcase struct { name string @@ -68,7 +107,7 @@ For tests with multiple test cases, it should always start by invoking extraAssertion func() } ``` -see example [here](./hashicorp/consul/test/integration/consul-container/test/upgrade/l7_traffic_management/resolver_default_subset_test.go). For upgrade tests with a single test case, they can be written like +see example [here](./l7_traffic_management/resolver_default_subset_test.go). For upgrade tests with a single test case, they can be written like ```go run := func(t *testing.T, oldVersion, targetVersion string) { // insert test @@ -78,34 +117,53 @@ see example [here](./hashicorp/consul/test/integration/consul-container/test/upg run(t, utils.LatestVersion, utils.TargetVersion) }) ``` -see example [here](./hashicorp/consul/test/integration/consul-container/test/upgrade/acl_node_test.go) +see example [here](./acl_node_test.go) -### How it works -![Upgrade Tests Workflow](util/upgrade_tests_workflow.png?raw=true) +Addtitional configurations or user-workload can be created with a customized [`create` function](./l7_traffic_management/resolver_default_subset_test.go). -A Consul cluster is deployed, then a static-server, static-client and envoy sidecars are created in the cluster. An API request is made to the static-client to validate that it is ready. +3. Call the upgrade method and assert the upgrading cluster succeeds. +We also restart the envoy proxy to make sure the upgraded agent can generate +the correct envoy configurations. -Then we validate traffic between the static-server and static-client envoy sidecar. After validation, we take a snapshot and upgrade the Consul cluster to the `target-version` and restart the sidecars. Re-validate the client, server and sidecars to ensure the data snapshotted from the pervious version can be accessed in the latest version. +```go + err = cluster.StandardUpgrade(t, context.Background(), targetVersion) + require.NoError(t, err) + require.NoError(t, staticServerConnectProxy.Restart()) + require.NoError(t, staticClientConnectProxy.Restart()) +``` +4. Verify the user workload after upgrade, e.g., + +```go + libassert.HTTPServiceEchoes(t, "localhost", port, "") + libassert.AssertFortioName(t, fmt.Sprintf("http://localhost:%d", appPort), "static-server-2-v2", "") +``` ### Errors Test Cases There are some caveats for special error handling of versions prior to `1.14`. Upgrade tests for features such peering, had API changes that returns an error if attempt to upgrade, and should be accounted for in upgrade tests. If running upgrade tests for any version before `1.14`, the following lines of code needs to be added to skip test or it will not pass. ```go - fromVersion, err := version.NewVersion(utils.LatestVersion) - require.NoError(t, err) - if fromVersion.LessThan(utils.Version_1_14) { - continue - } + fromVersion, err := version.NewVersion(utils.LatestVersion) + require.NoError(t, err) + if fromVersion.LessThan(utils.Version_1_14) { + continue + } ``` See example [here](https://github.com/hashicorp/consul-enterprise/blob/005a0a92c5f39804cef4ad5c4cd6fd3334b95aa2/test/integration/consul-container/test/upgrade/peering_control_plane_mgw_test.go#L92-L96) -To write tests for bugs found during upgrades, see example on how to add a testcase for those scenarios [here](./hashicorp/consul/test/integration/consul-container/test/upgrade/fullstopupgrade_test.go). +To write tests for bugs found during upgrades, see example on how to add a testcase for those scenarios [here](./fullstopupgrade_test.go). ## FAQS -**Q.** To troubleshoot, how can I send API request or consul command to the deployed cluster? +**Q.** Are containers' ports (e.g., consul's 8500, envoy sidecar's admin port +or local upstream port) exposed on the docker host? \ +**A.** Yes, they are exposed. However, they are exposed through a [pod container](https://github.com/hashicorp/consul/blob/57e034b74621180861226a01efeb3e9cedc74d3a/test/integration/consul-container/libs/cluster/container.go#L132). +That is, a consul agent and the envoy proxy containers registered with the agent +share the [same Linux network namespace (i.e., they share `localhost`)](https://github.com/hashicorp/consul/blob/57e034b74621180861226a01efeb3e9cedc74d3a/test/integration/consul-container/libs/cluster/app.go#L23-L30) as the pod container. +The pod container use the same prefix as the consul agent in its name. + +**Q.** To troubleshoot, how can I send API request or consul command to the deployed cluster? \ **A.** To send an API request or command to the deployed cluster, ensure that a cluster, services and sidecars have been created. See example below: ```go cluster, _, _ := topology.NewCluster() @@ -118,10 +176,10 @@ To write tests for bugs found during upgrades, see example on how to add a testc ``` Then in your terminal `docker ps -a | grep consul` to get the running services and cluster. Exec in the cluster and run commands directly or make API request to `localhost:port` to relevant service or `localhost:adminPort` for envoy. -**Q.** To troubleshoot, how can I access the envoy admin page? +**Q.** To troubleshoot, how can I access the envoy admin page? \ **A.** To access envoy admin page, ensure that a cluster, services and sidecars have been created. Then get the adminPort for the client or server sidecar. See example on how to get the port above. Then navigate to a browser and go to the url `http://localhost:adminPort/` -**Q.** My test stuck with the error "could not start or join all agents: container 0: port not found"? +**Q.** My test is stuck with the error "could not start or join all agents: container 0: port not found"? \ **A.** Simply re-run the tests. If the error persists, prune docker images `docker system prune`, run `make dev-docker`, then re-run tests again. **Q.** How to clean up the resources created the upgrade test?