From d6df681839076b90a99c9a3907e4e9303be6d67f Mon Sep 17 00:00:00 2001 From: Are Almaas Date: Wed, 5 Mar 2025 10:49:14 +0100 Subject: [PATCH 1/4] docs: add monitoring, ci-cd and infrastructure docs (#1994) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit ## Description This pull request simplifies the deployment documentation by removing the detailed GitHub Actions steps and manual deployment instructions from the README. The removed content has been replaced with pointers directing users to new documentation files. Additionally, three new documentation files have been added under the docs/ directory to cover CI/CD processes, infrastructure management, and monitoring capabilities. Consider this as a starting point for documentation of monitoring/infrastructure/ci-cd ## Related Issue(s) - #1967 ## Verification - [ ] **Your** code builds clean without any errors or warnings - [ ] Manual testing done (required) - [ ] Relevant automated test added (if you find this hard, leave it and we'll help out) ## Documentation - [ ] Documentation is updated (either in `docs`-directory, Altinnpedia or a separate linked PR in [altinn-studio-docs.](https://github.com/Altinn/altinn-studio-docs), if applicable) --------- Co-authored-by: Ole Jørgen Skogstad --- README.md | 143 +------------------------- docs/CI-CD.md | 223 +++++++++++++++++++++++++++++++++++++++++ docs/Infrastructure.md | 179 +++++++++++++++++++++++++++++++++ docs/Monitoring.md | 97 ++++++++++++++++++ 4 files changed, 503 insertions(+), 139 deletions(-) create mode 100644 docs/CI-CD.md create mode 100644 docs/Infrastructure.md create mode 100644 docs/Monitoring.md diff --git a/README.md b/README.md index 33eeab1b8..2804ff7e1 100644 --- a/README.md +++ b/README.md @@ -311,93 +311,16 @@ builder.Configuration For pull requests, the title must follow [Conventional Commits](https://www.conventionalcommits.org/en/v1.0.0/). 
The title of the PR will be used as the commit message when squashing/merging the pull request, and the body of the PR will be used as the description. -This title will be used to generate the changelog (using [Release Please](https://github.com/google-github-actions/release-please-action)) +This title will be used to generate the changelog (using [Release Please](https://github.com/googleapis/release-please-action)) Using `fix` will add to "Bug Fixes", `feat` will add to "Features", `chore` will add to "Miscellaneous Chores". All the others, `test`, `ci`, `trivial` etc., will be ignored. ([Example release](https://github.com/altinn/dialogporten/releases/tag/v1.12.0)) ## Deployment This repository contains code for both infrastructure and applications. Configurations for infrastructure are located in `.azure/infrastructure`. Application configuration is in `.azure/applications`. -### Deployment process +### Deployment process / GitHub actions -Deployments are done using `GitHub Actions` with the following steps: - -#### 1. Create and Merge Pull Request -- **Action**: Create a pull request. -- **Merge**: Once the pull request is reviewed and approved, merge it into the `main` branch. - -#### 2. Build and Deploy to Test -- **Trigger**: Merging the pull request into `main`. -- **Action**: The code is built and deployed to the test environment. -- **Tag**: The deployment is tagged with `-`. - -#### 3. Prepare Release for Staging -- **Passive**: Release-please creates or updates a release pull request. -- **Purpose**: This generates a changelog and bumps the version number. -- **Merge**: Merge the release pull request into the `main` branch. - -#### 4. Deploy to Staging and YT01 (Bump Version and Create Tag) -- **Trigger**: Merging the release pull request. -- **Action**: - - Bumps the version number. - - Generates the release and changelog. 
- - Deployment is tagged with the new `` without `` - - The new version is built and deployed to the staging environment (tt02) and the performance environment (yt01). - -#### 5. Prepare deployment to Production -- **Action**: Perform a dry run towards the production environment to ensure the deployment can proceed without issues. - -#### 6. Deploy to Production -- **Trigger**: Manual trigger of workflow, specify the version to deploy. -- **Action**: The specified version is deployed to the production environment. - -#### Visual Workflow - -![Deployment process](docs/deploy-process.png) - -[Release Please](https://github.com/google-github-actions/release-please-action) is used to create releases, generate changelog and bumping version numbers. - -`CHANGELOG.md` and `version.txt` are automatically updated and should not be changed manually. - -### Manual deployment (⚠️ handle with care) - -This project uses two GitHub dispatch workflows to manage manual deployments: `dispatch-apps.yml` and `dispatch-infrastructure.yml`. These workflows allow for manual triggers of deployments through GitHub Actions, providing flexibility for deploying specific versions to designated environments. - -#### Using `dispatch-apps.yml` - -The `dispatch-apps.yml` workflow is responsible for deploying applications. To trigger this workflow: - -1. Navigate to the Actions tab in the GitHub repository. -2. Select the `Dispatch Apps` workflow. -3. Click on "Run workflow". -4. Fill in the required inputs: - - **environment**: Choose the target environment (`test`, `yt01`, `staging`, or `prod`). - - **version**: Specify the version to deploy. Could be git tag or a docker-tag published in packages. - - **runMigration** (optional): Indicate whether to run database migrations (`true` or `false`). - -This workflow will handle the deployment of applications based on the specified parameters, ensuring that the correct version is deployed to the chosen environment. 
- -#### Using `dispatch-infrastructure.yml` - -The `dispatch-infrastructure.yml` workflow is used for deploying infrastructure components. To use this workflow: - -1. Go to the Actions tab in the GitHub repository. -2. Select the `Dispatch Infrastructure` workflow. -3. Click on "Run workflow". -4. Provide the necessary inputs: - - **environment**: Select the environment you wish to deploy to (`test`, `yt01`, `staging`, or `prod`). - - **version**: Enter the version to deploy, which should correspond to a git tag. (e.g., `1.23.4`). - -This workflow facilitates the deployment of infrastructure to the specified environment, using the version details provided. - -### GitHub Actions - -Naming conventions for GitHub Actions: -- `workflow-*.yml`: Reusable workflows -- `ci-cd-*.yml`: Workflows that are triggered by an event -- `dispatch-*.yml`: Workflows that are dispatchable - -The `workflow-check-for-changes.yml` workflow uses the `tj-actions/changed-files` action to check which files have been altered since last commit or tag. We use this filter to ensure we only deploy backend code or infrastructure if the respective files have been altered. +See [docs/CI-CD.md](docs/CI-CD.md) ### Infrastructure @@ -407,51 +330,7 @@ For example, to add a new storage account, you would: - Create or update a Bicep file within the `.azure/infrastructure` folder to include the storage account resource definition. - Ensure that the Bicep file is referenced correctly in `.azure/infrastructure/infrastructure.bicep` to be included in the deployment process. -Refer to the existing infrastructure definitions as templates for creating new components. - -#### Deploying a new infrastructure environment - -A few resources need to be created before we can apply the Bicep to create the main resources. - -The resources refer to a `source key vault` in order to fetch the necessary secrets and store them in the key vault for the environment. 
An `ssh`-key is also necessary for the `ssh-jumper` used to access the resources in Azure within the `vnet`. - -Use the following steps: - -- Ensure a `source key vault` exist for the new environment. Either create a new key vault or use an existing key vault. Currently, two key vaults exist for our environments. One in the test subscription used by Test and Staging, and one in our Production subscription, which Production uses. Ensure you add the necessary secrets that should be used by the new environment. Read here to learn about secret convention [Configuration Guide](docs/Configuration.md). Ensure also that the key vault has the following enabled: `Azure Resource Manager for template deployment`. - -- Ensure that a role assignment `Key Vault Secrets User` and `Contributer`(should be inherited) is added for the service principal used by the GitHub Entra Application. - -- Create an SSH key in Azure and discard the private key. We will use the `az cli` to access the virtual machine so storing the `ssh key` is only a security risk. - -- Create a new environment in GitHub and add the following secrets: `AZURE_CLIENT_ID`, `AZURE_SOURCE_KEY_VAULT_NAME`, `AZURE_SOURCE_KEY_VAULT_RESOURCE_GROUP`, `AZURE_SOURCE_KEY_VAULT_SUBSCRIPTION_ID`, `AZURE_SUBSCRIPTION_ID`, `AZURE_TENANT_ID` and `AZURE_SOURCE_KEY_VAULT_SSH_JUMPER_SSH_PUBLIC_KEY` - -- Add a new file for the environment `.azure/infrastructure/.bicepparam`. `` must match the environment created in GitHub. - -- Add the new environment in the `dispatch-infrastructure.yml` list of environments. - -- Run the GitHub action `Dispatch infrastructure` with the `version` you want to deploy and `environment`. All the resources in `.azure/infrastructure/main.bicep` should now be created. - -- (The GitHub action might need to restart because of a timeout when creating Redis). - -#### Connecting to resources in Azure - -There is a `ssh-jumper` virtual machine deployed with the infrastructure. 
This can be used to create a `ssh`-tunnel into the `vnet`. There are two ways to establish connections: - -1. Using `az ssh` commands directly: - ```bash - # Connect to the VNet using: - az ssh vm --resource-group dp-be--rg --vm-name dp-be--ssh-jumper - - # Or create an SSH tunnel for specific resources (e.g., PostgreSQL database): - az ssh vm -g dp-be--rg -n dp-be--ssh-jumper -- -L 5432::5432 - ``` - This example forwards the PostgreSQL default port (5432) to your localhost. Adjust the ports and hostnames as needed for other resources. - - You may be prompted to install the ssh extension. - -2. Using the forwarding utility script: - - See [scripts/database-forwarder/README.md](scripts/database-forwarder/README.md) for a more user-friendly way to establish database connections through SSH. +Refer to [docs/Infrastructure.md](docs/Infrastructure.md) for more detailed information. ### Applications @@ -464,17 +343,3 @@ For example, to add a new application named `web-api-new`, you would: - Add parameter files for each environment (e.g., `test.bicepparam`, `staging.bicepparam`) to specify environment-specific values. Refer to the existing applications like `web-api-so` and `web-api-eu` as templates. - -#### Deploying applications in a new infrastructure environment - -Ensure you have followed the steps in [Deploying a new infrastructure environment](#deploying-a-new-infrastructure-environment) to have the resources required for the applications. 
- -Use the following steps: - -- From the infrastructure resources created, add the following GitHub secrets in the new environment (this will not be necessary in the future as secrets would be added directly from infrastructure deployment): `AZURE_APP_CONFIGURATION_NAME`, `AZURE_APP_INSIGHTS_CONNECTION_STRING`, `AZURE_CONTAINER_APP_ENVIRONMENT_NAME`, `AZURE_ENVIRONMENT_KEY_VAULT_NAME`, `AZURE_REDIS_NAME`, `AZURE_RESOURCE_GROUP_NAME`, `AZURE_SERVICE_BUS_NAMESPACE_NAME` and `AZURE_SLACK_NOTIFIER_FUNCTION_APP_NAME` - -- Add new parameter files for the environment in all applications `.azure/applications/*/.bicepparam` - -- Run the GitHub action `Dispatch applications` in order to deploy all applications to the new environment. - -- To expose the applications through APIM, see [Common APIM Guide](docs/CommonAPIM.md) diff --git a/docs/CI-CD.md b/docs/CI-CD.md new file mode 100644 index 000000000..8905e4e5c --- /dev/null +++ b/docs/CI-CD.md @@ -0,0 +1,223 @@ +# Dialogporten CI/CD Documentation + +Naming conventions for GitHub Actions: +- `workflow-*.yml`: Reusable workflows +- `ci-cd-*.yml`: Workflows that are triggered by an event +- `dispatch-*.yml`: Workflows that are dispatchable + +## Dialogporten CI/CD Flow + +### 1. Development & Merge Process + +1. **Development** + - Create feature branch from `main` + - Follow branch naming convention: `(feat|fix|docs|test|ci|chore|trivial)!?(\\(.*\\))?!?:.*` + - Create PR against `main` + - PR title must follow conventional commits format (validated by `ci-cd-pull-request-title.yml`) + - Get code review and approval + - Merge to `main` + +2. **Main Branch Triggers** +When code is merged to `main`, two parallel workflows are triggered: + + a. **CI/CD Main** (`ci-cd-main.yml`) + - Automatically deploys to Test environment + - Runs full deployment including: + - Infrastructure if changed + - Applications if changed + - Runs tests + - Updates dependencies + + b. 
**Release Please** (`ci-cd-release-please.yml`) + - Checks if changes warrant a new release + - Either: + - Creates/updates release PR, or + - Builds and publishes Docker images if release is complete + +### 2. Release & Deployment Flow + +#### When Release is Created/Published: +Three parallel workflows are triggered: + +1. **Production Dry Run** (`ci-cd-prod-dry-run.yml`) + - Validates production deployment configuration + - No actual deployment + - Early warning for potential production issues + +2. **Staging Deployment** (`ci-cd-staging.yml`) + - Deploys to staging (tt02) environment + - Full deployment including: + - Infrastructure updates + - Application deployment + - Database migrations + - SDK publishing + - End-to-end testing + +3. **YT01 Deployment** (`ci-cd-yt01.yml`) + - Deploys to YT01 environment + - Performance testing environment + - Full deployment similar to staging + +#### Production Deployment +- **Manual Trigger Required** (`ci-cd-prod.yml`) +- Requires specific version input +- Full deployment process: + - Version verification + - Infrastructure deployment + - Application deployment + - SDK publishing + - Version tracking updates + +### 3. Environment Flow +``` +Development → Main Branch → Test → [YT01 + Staging] → Production + ↑ ↑ ↑ + Auto deploy Auto deploy Manual deploy + on merge on release with version +``` + +### 4. Environment Purposes + +- **Test**: Automatic deployment target for all changes merged to main +- **YT01**: Performance test environment, automatically updated with releases +- **Staging (tt02)**: Pre-production verification, automatically updated with releases +- **Production**: Production environment, requires manual deployment trigger + +### 5. 
Manual Control Options + +Available manual workflows for all environments: +- `dispatch-infrastructure.yml`: Infrastructure deployment +- `dispatch-apps.yml`: Application deployment +- `dispatch-k6-tests.yml`: Functional testing +- `dispatch-k6-performance.yml`: Performance testing +- `dispatch-k6-breakpoint.yml`: Breakpoint testing + +#### Using `dispatch-apps.yml` + +The `dispatch-apps.yml` workflow is responsible for deploying applications. To trigger this workflow: + +1. Navigate to the Actions tab in the GitHub repository. +2. Select the `Dispatch Apps` workflow. +3. Click on "Run workflow". +4. Fill in the required inputs: + - **environment**: Choose the target environment (`test`, `yt01`, `staging`, or `prod`). + - **version**: Specify the version to deploy. This can be a git tag or a Docker tag published in packages. + - **runMigration** (optional): Indicate whether to run database migrations (`true` or `false`). + +This workflow deploys the applications with the specified parameters, so that the chosen version ends up in the chosen environment. + +#### Using `dispatch-infrastructure.yml` + +The `dispatch-infrastructure.yml` workflow is used for deploying infrastructure components. To use this workflow: + +1. Go to the Actions tab in the GitHub repository. +2. Select the `Dispatch Infrastructure` workflow. +3. Click on "Run workflow". +4. Provide the necessary inputs: + - **environment**: Select the environment you wish to deploy to (`test`, `yt01`, `staging`, or `prod`). + - **version**: Enter the version to deploy, which should correspond to a git tag (e.g., `1.23.4`). + +This workflow deploys the infrastructure at the given version to the specified environment. + + +### 6. 
Version Management + +- Release-please manages versioning based on conventional commits +- Versions are tracked in GitHub environment variables +- Separate tracking for infrastructure and applications +- Docker images tagged with release versions +- SDK and schema packages versioned with releases + +[Release Please](https://github.com/googleapis/release-please-action) is used to create releases, generate the changelog, and bump version numbers. + +`CHANGELOG.md` and `version.txt` are automatically updated and should not be changed manually. + +### 7. Visual Workflow + +![Deployment process](deploy-process.png) + +## Version Tracking and Change Detection + +### 1. Version Storage Purpose +- GitHub environment variables store the latest deployed versions for each environment +- Separate tracking for: + - Infrastructure version (`LATEST_DEPLOYED_INFRA_VERSION`) + - Applications version (`LATEST_DEPLOYED_APPS_VERSION`) +- This enables accurate detection of what needs to be deployed in each environment + +### 2. Change Detection Process (`workflow-check-for-changes.yml`) + +1. **Version Comparison** + - Retrieves latest deployed versions from GitHub environment variables + - Compares current deployment version with last deployed version + - Uses git commit SHAs to determine exact changes between versions + +2. **Change Categories Tracked** + ```yaml + Changes detected in: + - Infrastructure (Azure resources, GitHub workflows) + - Backend code + - Web API client + - Test files + - Swagger schema + - GraphQL schema + - Database migrations + - Slack notifier + ``` + +3. **Smart Deployment Decisions** + - Only deploys components that have actually changed + - Infrastructure deployment skipped if no infrastructure changes + - App deployment skipped if no application changes + - Migrations run only when database changes exist + - SDK published only on API/schema changes + +### 3. 
Implementation Example + +```yaml +# Getting latest deployed versions +get-versions-from-github: + name: Get Latest Deployed Version Info from GitHub + uses: ./.github/workflows/workflow-get-latest-deployed-version-info-from-github.yml + with: + environment: prod + secrets: + GH_TOKEN: ${{ secrets.RELEASE_VERSION_STORAGE_PAT }} + +# Checking for changes +check-for-changes: + name: Check for changes + needs: [get-versions-from-github] + uses: ./.github/workflows/workflow-check-for-changes.yml + with: + infra_base_sha: ${{ needs.get-versions-from-github.outputs.infra_version_sha }} + apps_base_sha: ${{ needs.get-versions-from-github.outputs.apps_version_sha }} +``` + +### 4. Example Workflow + +1. **New Release Created (v1.2.3)** + ```plaintext + Current State: + - Production: v1.2.1 + - Changes detected: + • Infrastructure: No changes + • Backend code: Modified + • Database: New migration + ``` + +2. **Deployment Process** + ```plaintext + Actions: + - Skip infrastructure deployment + - Deploy new application version + - Run database migration + - Update LATEST_DEPLOYED_APPS_VERSION to v1.2.3 + ``` + +3. 
**After Deployment** + ```plaintext + New State: + - LATEST_DEPLOYED_INFRA_VERSION remains at v1.2.1 + - LATEST_DEPLOYED_APPS_VERSION updated to v1.2.3 + ``` diff --git a/docs/Infrastructure.md b/docs/Infrastructure.md new file mode 100644 index 000000000..c192458c2 --- /dev/null +++ b/docs/Infrastructure.md @@ -0,0 +1,179 @@ +# Infrastructure + +## Resource Naming + +All resources follow a consistent naming pattern: +- Prefix: `dp-be-{environment}` (dp = Dialogporten, be = Backend) +- Resource Group: `{prefix}-rg` +- Resources within the group append their type identifier: + - Key Vault: `{prefix}-kv` + - App Configuration: `{prefix}-appconfig` + - Application Insights: `{prefix}-ai` + - PostgreSQL: `{prefix}-psql` + - Service Bus: `{prefix}-sb` + - Virtual Network: `{prefix}-vnet` + - Redis Cache: `{prefix}-redis` + - SSH Jumper: `{prefix}-ssh-jumper` + +## Secret Management and Cross-Environment Configuration + +### Source Key Vault Pattern +The infrastructure uses a source Key Vault pattern for managing secrets across environments: + +1. **Source Key Vault Configuration** + - Subscription ID, Resource Group, and Name are passed as secure parameters + - Used as the central source of truth for cross-environment secrets + +2. **Secret Copying Pattern** + ``` + Source Key Vault -> Environment-specific Key Vault -> App Configuration + ``` + +3. **Environment-Specific Secrets** + - PostgreSQL passwords follow the pattern: `dialogportenPgAdminPassword{environment}` + - SSH public keys are stored in the source vault + - Secrets are conditionally copied based on existence in source vault + +4. 
**Secret Management Flow** + - Secrets are read from environment variables during deployment + - Copied to environment-specific Key Vaults + - Referenced by services using managed identities + - Some secrets are also copied to App Configuration for application use + +## Environment Configuration Patterns + +### Parameter Files +Each environment (`prod`, `staging`, `test`, `yt01`) has its own `.bicepparam` file containing: +1. Environment name +2. Location (norwayeast) +3. SKU configurations +4. Environment-specific object IDs +5. Environment URLs + +### Environment Variables +Required environment variables for deployment: +- `AZURE_KEY_VAULT_SOURCE_KEYS` +- `PG_ADMIN_PASSWORD` +- `AZURE_SOURCE_KEY_VAULT_SUBSCRIPTION_ID` +- `AZURE_SOURCE_KEY_VAULT_RESOURCE_GROUP` +- `AZURE_SOURCE_KEY_VAULT_NAME` +- `AZURE_SOURCE_KEY_VAULT_SSH_JUMPER_SSH_PUBLIC_KEY` + +## Tagging Convention + +All resources are tagged with: +```json +{ + "Environment": "", + "Product": "Dialogporten" +} +``` + +## Network Segmentation Pattern + +The Virtual Network follows a consistent subnet allocation pattern: +1. Default subnet: 10.0.0.0/24 +2. PostgreSQL subnet: 10.0.1.0/24 +3. Container Apps Environment: 10.0.2.0/23 +4. Service Bus subnet: 10.0.4.0/24 +5. Redis subnet: 10.0.5.0/24 + +## Security Patterns + +1. **Private Endpoint Pattern** + - All PaaS services use private endpoints + - Private DNS zones for each service type + - Private endpoint groups for service integration + +2. **Identity Management Pattern** + - System-assigned managed identities for services + - AAD group-based access control + - Different admin groups for prod/non-prod environments + +3. **Secret Rotation Pattern** + - Secrets stored in source Key Vault + - Copied to environment-specific vaults + - Referenced by services using managed identities + +## Monitoring Pattern + +1. 
**Application Insights Integration** + - Workspace-based deployment + - Availability tests for critical endpoints + - Optional immediate data purge after 30 days + +2. **PostgreSQL Monitoring** + - Index tuning (configurable per environment) + - Query performance insights (configurable per environment) + - Integration with Log Analytics workspace + +## High Availability Patterns + +### Production Environment +- PostgreSQL: Zone-redundant with standby in zone 2 +- Service Bus: Premium SKU with zone redundancy +- Container Apps: Multiple replicas across zones +- Redis: Basic SKU (consider upgrading for HA requirements) + +### Non-Production Environments +- Single zone deployments +- Reduced SKUs for cost optimization +- Shorter backup retention periods + +## Deploying a new infrastructure environment + +A few resources need to be created before we can apply the Bicep to create the main resources. + +The resources refer to a `source key vault` in order to fetch the necessary secrets and store them in the key vault for the environment. An `ssh`-key is also necessary for the `ssh-jumper` used to access the resources in Azure within the `vnet`. + +Use the following steps: + +- Ensure a `source key vault` exists for the new environment. Either create a new key vault or use an existing key vault. Currently, two key vaults exist for our environments. One in the test subscription used by Test and Staging, and one in our Production subscription, which Production uses. Ensure you add the necessary secrets that should be used by the new environment. Read here to learn about secret convention [Configuration Guide](Configuration.md). Ensure also that the key vault has the following enabled: `Azure Resource Manager for template deployment`. + +- Ensure that the role assignments `Key Vault Secrets User` and `Contributor` (should be inherited) are added for the service principal used by the GitHub Entra Application. + +- Create an SSH key in Azure and discard the private key. 
We will use the `az cli` to access the virtual machine so storing the `ssh key` is only a security risk. + +- Create a new environment in GitHub and add the following secrets: `AZURE_CLIENT_ID`, `AZURE_SOURCE_KEY_VAULT_NAME`, `AZURE_SOURCE_KEY_VAULT_RESOURCE_GROUP`, `AZURE_SOURCE_KEY_VAULT_SUBSCRIPTION_ID`, `AZURE_SUBSCRIPTION_ID`, `AZURE_TENANT_ID` and `AZURE_SOURCE_KEY_VAULT_SSH_JUMPER_SSH_PUBLIC_KEY` + +- Add a new file for the environment `.azure/infrastructure/.bicepparam`. `` must match the environment created in GitHub. + +- Add the new environment in the `dispatch-infrastructure.yml` list of environments. + +- Run the GitHub action `Dispatch infrastructure` with the `version` you want to deploy and `environment`. All the resources in `.azure/infrastructure/main.bicep` should now be created. + +- (The GitHub action might need to restart because of a timeout when creating Redis). + +## Deploying applications in a new infrastructure environment + +Ensure you have followed the steps in [Deploying a new infrastructure environment](#deploying-a-new-infrastructure-environment) to have the resources required for the applications. + +Use the following steps: + +- From the infrastructure resources created, add the following GitHub secrets in the new environment (this will not be necessary in the future as secrets would be added directly from infrastructure deployment): `AZURE_APP_CONFIGURATION_NAME`, `AZURE_APP_INSIGHTS_CONNECTION_STRING`, `AZURE_CONTAINER_APP_ENVIRONMENT_NAME`, `AZURE_ENVIRONMENT_KEY_VAULT_NAME`, `AZURE_REDIS_NAME`, `AZURE_RESOURCE_GROUP_NAME`, `AZURE_SERVICE_BUS_NAMESPACE_NAME` and `AZURE_SLACK_NOTIFIER_FUNCTION_APP_NAME` + +- Add new parameter files for the environment in all applications `.azure/applications/*/.bicepparam` + +- Run the GitHub action `Dispatch applications` in order to deploy all applications to the new environment. 
+ +- To expose the applications through APIM, see [Common APIM Guide](CommonAPIM.md) + +## Connecting to resources in Azure + +There is an `ssh-jumper` virtual machine deployed with the infrastructure. This can be used to create an `ssh`-tunnel into the `vnet`. There are two ways to establish connections: + +1. Using `az ssh` commands directly: + ```bash + # Connect to the VNet using: + az ssh vm --resource-group dp-be--rg --vm-name dp-be--ssh-jumper + + # Or create an SSH tunnel for specific resources (e.g., PostgreSQL database): + az ssh vm -g dp-be--rg -n dp-be--ssh-jumper -- -L 5432::5432 + ``` + This example forwards the PostgreSQL default port (5432) to your localhost. Adjust the ports and hostnames as needed for other resources. + + You may be prompted to install the ssh extension. + +2. Using the forwarding utility script: + + See [scripts/database-forwarder/README.md](../scripts/database-forwarder/README.md) for a more user-friendly way to establish database connections through SSH. \ No newline at end of file diff --git a/docs/Monitoring.md b/docs/Monitoring.md new file mode 100644 index 000000000..df67163d2 --- /dev/null +++ b/docs/Monitoring.md @@ -0,0 +1,97 @@ +# Monitoring + +## Overview Dashboard + +Dialogporten's monitoring dashboards are hosted in Grafana and provide comprehensive insights into system performance, health metrics, and operational status. The dashboards are accessible at [Grafana Altinn Cloud](https://grafana.altinn.cloud/dashboards/f/ce99lm57b1gcgd/). 
+
+### Main Metrics
+- **System Health**: Availability, request stats, latency
+- **Container Apps**: CPU, memory, requests (GraphQL, Web APIs)
+- **Infrastructure**: PostgreSQL, Redis, Service Bus status
+
+### Usage
+- Select environment (test, yt01, staging, prod)
+- Default view: Last 24 hours
+- Start with system health, then drill down as needed
+
+## Telemetry Collection
+
+Dialogporten uses OpenTelemetry for collecting and routing telemetry data:
+
+### OpenTelemetry Integration
+- Utilizes Azure Container Apps' managed OpenTelemetry agent
+- Automatically collects traces and logs from container apps
+- Routes telemetry data to Azure Application Insights
+- Configured through Container Apps Environment settings
+
+### Data Flow
+1. Applications emit OpenTelemetry-compliant telemetry
+2. Container Apps OpenTelemetry agent collects the data
+3. Data is sent to Azure Application Insights
+4. Grafana visualizes the data through Azure Monitor data source
+
+### Implementation Details
+- Traces and logs are configured to use Application Insights as destination
+- Uses standard OpenTelemetry instrumentation for .NET
+- Automatic correlation of distributed traces across services
+- Custom metrics and traces can be added through the OpenTelemetry SDK
+
+## Redis Dashboard
+
+Detailed monitoring of Redis cache performance and health:
+
+### Key Metrics
+- **Memory Usage**: Total and percentage used memory
+- **Operations**: Commands executed, cache hits/misses
+- **Keys**: Total keys, expired vs evicted keys
+- **Connections**: Connected clients, server load
+- **Performance**: Cache hit ratio, command processing rate
+
+### Usage
+- Select subscription, environment, and Redis resource
+- Default view: Last 24 hours
+- Refresh interval: 30 seconds
+
+## Container Apps Dashboard
+
+Monitoring of Azure Container Apps deployments and performance:
+
+### Key Metrics
+- **System Logs**: Container app system events and logs
+- **Application Logs**: Service-specific application traces
+- **Deployment Status**: Revision tracking and deployment logs
+
+### Usage
+- Filter by service name and revision
+- View logs by deployment or system events
+- Track service-specific metrics and traces
+
+## Service Bus Dashboard
+
+Azure Service Bus monitoring for message processing:
+
+### Key Metrics
+- **Queue/Topic Health**: Message counts, processing rates
+- **Resource Usage**: Namespace metrics
+- **Performance**: Throughput, latency, request rates
+
+### Usage
+- Select namespace and queue/topic
+- Monitor message processing status
+- Track service bus resource utilization
+
+## PostgreSQL Dashboard
+
+Azure Database for PostgreSQL Flexible Server monitoring:
+
+### Key Metrics
+- **Server Health**: CPU, memory, IOPS
+- **Database Performance**: Connections, throughput
+- **Storage**: Usage and performance metrics
+- **Latency**: Query response times
+
+### Usage
+- Select server instance and database
+- Monitor resource utilization
+- Track query performance and connections
+

From 1161c9f8d50f3b6f615e2928798ef1a2e7c6372a Mon Sep 17 00:00:00 2001
From: "renovate[bot]" <29139614+renovate[bot]@users.noreply.github.com>
Date: Wed, 5 Mar 2025 12:25:24 +0100
Subject: [PATCH 2/4] chore(deps): update dependency bogus to 35.6.2 (#2004)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

This PR contains the following updates:

| Package | Change | Age | Adoption | Passing | Confidence |
|---|---|---|---|---|---|
| [Bogus](https://redirect.github.com/bchavez/Bogus) | `35.6.1` -> `35.6.2` | [![age](https://developer.mend.io/api/mc/badges/age/nuget/Bogus/35.6.2?slim=true)](https://docs.renovatebot.com/merge-confidence/) | [![adoption](https://developer.mend.io/api/mc/badges/adoption/nuget/Bogus/35.6.2?slim=true)](https://docs.renovatebot.com/merge-confidence/) | [![passing](https://developer.mend.io/api/mc/badges/compatibility/nuget/Bogus/35.6.1/35.6.2?slim=true)](https://docs.renovatebot.com/merge-confidence/) | [![confidence](https://developer.mend.io/api/mc/badges/confidence/nuget/Bogus/35.6.1/35.6.2?slim=true)](https://docs.renovatebot.com/merge-confidence/) |

---

### Release Notes

bchavez/Bogus (Bogus)

### [`v35.6.2`](https://redirect.github.com/bchavez/Bogus/blob/HEAD/HISTORY.md#v3562)

Release Date: 2025-02-20

- PR 584: Pack LICENSE file with NuGet package. Also, use ProjectIcon.
- Issue 581: Fix `Randomizer.ULong()` arithmetic overflow. Thanks [@reuterma24](https://redirect.github.com/reuterma24)!
- PR 586: Use .NET 9 SDK build tooling. Thanks [@SimonCropp](https://redirect.github.com/SimonCropp)!
- PR 587: Fix CS1584 incorrect use of cref in XML doc comment. Thanks [@SimonCropp](https://redirect.github.com/SimonCropp)!
- PR 589: Unlock ability to use any .NET SDK build tooling on AppVeyor. Thanks [@SimonCropp](https://redirect.github.com/SimonCropp)!

---

### Configuration

📅 **Schedule**: Branch creation - "before 7am on Sunday,before 7am on Wednesday" (UTC), Automerge - At any time (no schedule defined).

🚦 **Automerge**: Disabled by config. Please merge this manually once you are satisfied.

♻ **Rebasing**: Whenever PR is behind base branch, or you tick the rebase/retry checkbox.

🔕 **Ignore**: Close this PR and you won't be reminded about this update again.

---

- [ ] If you want to rebase/retry this PR, check this box

---

This PR was generated by [Mend Renovate](https://mend.io/renovate/). View the [repository job log](https://developer.mend.io/github/Altinn/dialogporten).

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
---
 .../Digdir.Domain.Dialogporten.Infrastructure.csproj | 2 +-
 .../Digdir.Tool.Dialogporten.GenerateFakeData.csproj | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/Digdir.Domain.Dialogporten.Infrastructure/Digdir.Domain.Dialogporten.Infrastructure.csproj b/src/Digdir.Domain.Dialogporten.Infrastructure/Digdir.Domain.Dialogporten.Infrastructure.csproj
index 6d592eb00..2bc36ec4c 100644
--- a/src/Digdir.Domain.Dialogporten.Infrastructure/Digdir.Domain.Dialogporten.Infrastructure.csproj
+++ b/src/Digdir.Domain.Dialogporten.Infrastructure/Digdir.Domain.Dialogporten.Infrastructure.csproj
@@ -7,7 +7,7 @@
- +
diff --git a/src/Digdir.Tool.Dialogporten.GenerateFakeData/Digdir.Tool.Dialogporten.GenerateFakeData.csproj b/src/Digdir.Tool.Dialogporten.GenerateFakeData/Digdir.Tool.Dialogporten.GenerateFakeData.csproj
index dac617cd3..09afc70cd 100644
--- a/src/Digdir.Tool.Dialogporten.GenerateFakeData/Digdir.Tool.Dialogporten.GenerateFakeData.csproj
+++ b/src/Digdir.Tool.Dialogporten.GenerateFakeData/Digdir.Tool.Dialogporten.GenerateFakeData.csproj
@@ -10,7 +10,7 @@
- +

From 78a89fb74a212d3e14398e79f43e2d587da64a6f Mon Sep 17 00:00:00 2001
From: "renovate[bot]" <29139614+renovate[bot]@users.noreply.github.com>
Date: Wed, 5 Mar 2025 12:25:36 +0100
Subject: [PATCH 3/4] chore(deps): update dependency htmlagilitypack to 1.11.73 (#2005)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

This PR contains the following updates:

| Package | Change | Age | Adoption | Passing | Confidence |
|---|---|---|---|---|---|
| [HtmlAgilityPack](http://html-agility-pack.net/) ([source](https://redirect.github.com/zzzprojects/html-agility-pack)) | `1.11.72` -> `1.11.73` | [![age](https://developer.mend.io/api/mc/badges/age/nuget/HtmlAgilityPack/1.11.73?slim=true)](https://docs.renovatebot.com/merge-confidence/) | [![adoption](https://developer.mend.io/api/mc/badges/adoption/nuget/HtmlAgilityPack/1.11.73?slim=true)](https://docs.renovatebot.com/merge-confidence/) | [![passing](https://developer.mend.io/api/mc/badges/compatibility/nuget/HtmlAgilityPack/1.11.72/1.11.73?slim=true)](https://docs.renovatebot.com/merge-confidence/) | [![confidence](https://developer.mend.io/api/mc/badges/confidence/nuget/HtmlAgilityPack/1.11.72/1.11.73?slim=true)](https://docs.renovatebot.com/merge-confidence/) |

---

### Release Notes

zzzprojects/html-agility-pack (HtmlAgilityPack)

### [`v1.11.73`](https://redirect.github.com/zzzprojects/html-agility-pack/releases/tag/v1.11.73)

#### Download the library **[here](https://www.nuget.org/packages/HtmlAgilityPack/)**

- **MERGED:** Fixes issues with colgroup and caption tags not being closed properly. Issues [#584](https://redirect.github.com/zzzprojects/html-agility-pack/issues/584) and [#583](https://redirect.github.com/zzzprojects/html-agility-pack/issues/583) (PR [#585](https://redirect.github.com/zzzprojects/html-agility-pack/issues/585))
- **MERGED:** Make LoadFromWebAsync honor the Timeout property. Fixes [#580](https://redirect.github.com/zzzprojects/html-agility-pack/issues/580) (PR [#582](https://redirect.github.com/zzzprojects/html-agility-pack/issues/582))

***

#### Library Sponsored By

This library is sponsored by [Entity Framework Extensions](https://entityframework-extensions.net/)

---

### Configuration

📅 **Schedule**: Branch creation - "before 7am on Sunday,before 7am on Wednesday" (UTC), Automerge - At any time (no schedule defined).

🚦 **Automerge**: Disabled by config. Please merge this manually once you are satisfied.

♻ **Rebasing**: Whenever PR is behind base branch, or you tick the rebase/retry checkbox.

🔕 **Ignore**: Close this PR and you won't be reminded about this update again.

---

- [ ] If you want to rebase/retry this PR, check this box

---

This PR was generated by [Mend Renovate](https://mend.io/renovate/). View the [repository job log](https://developer.mend.io/github/Altinn/dialogporten).

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
---
 .../Digdir.Domain.Dialogporten.Application.csproj | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/Digdir.Domain.Dialogporten.Application/Digdir.Domain.Dialogporten.Application.csproj b/src/Digdir.Domain.Dialogporten.Application/Digdir.Domain.Dialogporten.Application.csproj
index 8b4cac3d8..40f79af0e 100644
--- a/src/Digdir.Domain.Dialogporten.Application/Digdir.Domain.Dialogporten.Application.csproj
+++ b/src/Digdir.Domain.Dialogporten.Application/Digdir.Domain.Dialogporten.Application.csproj
@@ -8,7 +8,7 @@
- +

From d7938be80239a1a20cde9e7a9217655d94b81d5b Mon Sep 17 00:00:00 2001
From: "renovate[bot]" <29139614+renovate[bot]@users.noreply.github.com>
Date: Wed, 5 Mar 2025 12:25:51 +0100
Subject: [PATCH 4/4] chore(deps): update npgsql dependencies to 9.0.3 (#2006)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

This PR contains the following updates:

| Package | Change | Age | Adoption | Passing | Confidence |
|---|---|---|---|---|---|
| [Npgsql](https://redirect.github.com/npgsql/npgsql) | `9.0.2` -> `9.0.3` | [![age](https://developer.mend.io/api/mc/badges/age/nuget/Npgsql/9.0.3?slim=true)](https://docs.renovatebot.com/merge-confidence/) | [![adoption](https://developer.mend.io/api/mc/badges/adoption/nuget/Npgsql/9.0.3?slim=true)](https://docs.renovatebot.com/merge-confidence/) | [![passing](https://developer.mend.io/api/mc/badges/compatibility/nuget/Npgsql/9.0.2/9.0.3?slim=true)](https://docs.renovatebot.com/merge-confidence/) | [![confidence](https://developer.mend.io/api/mc/badges/confidence/nuget/Npgsql/9.0.2/9.0.3?slim=true)](https://docs.renovatebot.com/merge-confidence/) |
| [Npgsql.OpenTelemetry](https://redirect.github.com/npgsql/npgsql) | `9.0.2` -> `9.0.3` | [![age](https://developer.mend.io/api/mc/badges/age/nuget/Npgsql.OpenTelemetry/9.0.3?slim=true)](https://docs.renovatebot.com/merge-confidence/) | [![adoption](https://developer.mend.io/api/mc/badges/adoption/nuget/Npgsql.OpenTelemetry/9.0.3?slim=true)](https://docs.renovatebot.com/merge-confidence/) | [![passing](https://developer.mend.io/api/mc/badges/compatibility/nuget/Npgsql.OpenTelemetry/9.0.2/9.0.3?slim=true)](https://docs.renovatebot.com/merge-confidence/) | [![confidence](https://developer.mend.io/api/mc/badges/confidence/nuget/Npgsql.OpenTelemetry/9.0.2/9.0.3?slim=true)](https://docs.renovatebot.com/merge-confidence/) |

---

### Release Notes

npgsql/npgsql (Npgsql)

### [`v9.0.3`](https://redirect.github.com/npgsql/npgsql/releases/tag/v9.0.3)

v9.0.3 contains several minor bug fixes.

[Milestone issues](https://redirect.github.com/npgsql/npgsql/milestone/126?closed=1)

**Full Changelog**: https://github.com/npgsql/npgsql/compare/v9.0.2...v9.0.3

---

### Configuration

📅 **Schedule**: Branch creation - "before 7am on Sunday,before 7am on Wednesday" (UTC), Automerge - At any time (no schedule defined).

🚦 **Automerge**: Disabled by config. Please merge this manually once you are satisfied.

♻ **Rebasing**: Whenever PR is behind base branch, or you tick the rebase/retry checkbox.

🔕 **Ignore**: Close this PR and you won't be reminded about these updates again.

---

- [ ] If you want to rebase/retry this PR, check this box

---

This PR was generated by [Mend Renovate](https://mend.io/renovate/). View the [repository job log](https://developer.mend.io/github/Altinn/dialogporten).

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
---
 .../Digdir.Domain.Dialogporten.Application.csproj | 2 +-
 .../Digdir.Domain.Dialogporten.Infrastructure.csproj | 2 +-
 .../Digdir.Library.Utils.AspNet.csproj | 2 +-
 .../Digdir.Tool.Dialogporten.LargeDataSetGenerator.csproj | 2 +-
 4 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/src/Digdir.Domain.Dialogporten.Application/Digdir.Domain.Dialogporten.Application.csproj b/src/Digdir.Domain.Dialogporten.Application/Digdir.Domain.Dialogporten.Application.csproj
index 40f79af0e..b209c937d 100644
--- a/src/Digdir.Domain.Dialogporten.Application/Digdir.Domain.Dialogporten.Application.csproj
+++ b/src/Digdir.Domain.Dialogporten.Application/Digdir.Domain.Dialogporten.Application.csproj
@@ -12,7 +12,7 @@
- +
diff --git a/src/Digdir.Domain.Dialogporten.Infrastructure/Digdir.Domain.Dialogporten.Infrastructure.csproj b/src/Digdir.Domain.Dialogporten.Infrastructure/Digdir.Domain.Dialogporten.Infrastructure.csproj
index 2bc36ec4c..fd22d474f 100644
--- a/src/Digdir.Domain.Dialogporten.Infrastructure/Digdir.Domain.Dialogporten.Infrastructure.csproj
+++ b/src/Digdir.Domain.Dialogporten.Infrastructure/Digdir.Domain.Dialogporten.Infrastructure.csproj
@@ -10,7 +10,7 @@
- + all
diff --git a/src/Digdir.Library.Utils.AspNet/Digdir.Library.Utils.AspNet.csproj b/src/Digdir.Library.Utils.AspNet/Digdir.Library.Utils.AspNet.csproj
index c4dae965a..4ebc7d77d 100644
--- a/src/Digdir.Library.Utils.AspNet/Digdir.Library.Utils.AspNet.csproj
+++ b/src/Digdir.Library.Utils.AspNet/Digdir.Library.Utils.AspNet.csproj
@@ -15,7 +15,7 @@
- +
diff --git a/src/Digdir.Tool.Dialogporten.LargeDataSetGenerator/Digdir.Tool.Dialogporten.LargeDataSetGenerator.csproj b/src/Digdir.Tool.Dialogporten.LargeDataSetGenerator/Digdir.Tool.Dialogporten.LargeDataSetGenerator.csproj
index 7098d98af..899271c3c 100644
--- a/src/Digdir.Tool.Dialogporten.LargeDataSetGenerator/Digdir.Tool.Dialogporten.LargeDataSetGenerator.csproj
+++ b/src/Digdir.Tool.Dialogporten.LargeDataSetGenerator/Digdir.Tool.Dialogporten.LargeDataSetGenerator.csproj
@@ -6,7 +6,7 @@
- +