RUM-4079 chore: Migrate release automation to GitLab #1945

Merged: ncreated merged 4 commits into develop from ncreated/RUM-4079/migrate-release-automation on Jul 10, 2024

@ncreated ncreated commented Jul 5, 2024

What and why?

📦 🧰 This PR migrates release automation to GitLab, following PRs #1921 and #1910.

Remaining automations in Bitrise:

  • ❌ SR snapshot tests - will be migrated next
  • ❌ Dogfooding automation
  • ❌ E2E tests app upload to s8s

How?

Overview

The previous release.py (Python) automation has been replaced with make + shell scripts, retaining the same release model. I made a few improvements to the failure resistance of publishing Cocoapods podspecs, which is prone to timeouts. Failed CP jobs can now be safely retried, and specs that were already pushed will be skipped.
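
As a rough illustration of the retry-safe publishing described above, the sketch below skips a podspec that is already published and pushes it otherwise. It is not the code from this PR: the function names and the SPEC_PUBLISHED_CMD / SPEC_PUSH_CMD hooks are hypothetical, and the default is_spec_published is only a placeholder that always reports "not published". The only real CLI call used is pod trunk push.

```shell
# Hypothetical hooks (not from the PR), overridable for testing or real checks.
# SPEC_PUBLISHED_CMD: exits 0 when <podspec> at <version> is already on trunk.
# SPEC_PUSH_CMD: pushes <podspec> to trunk.
SPEC_PUBLISHED_CMD="${SPEC_PUBLISHED_CMD:-is_spec_published}"
SPEC_PUSH_CMD="${SPEC_PUSH_CMD:-push_spec}"

is_spec_published() {
    # Placeholder: a real check would query the CocoaPods trunk for "$1" at "$2".
    return 1
}

push_spec() {
    pod trunk push "$1"
}

# Push a podspec only if that version is not published yet,
# so a failed job can be retried without re-pushing earlier specs.
push_if_missing() {
    local podspec="$1"
    local version="$2"
    if "$SPEC_PUBLISHED_CMD" "$podspec" "$version"; then
        echo "skip: $podspec $version already published"
        return 0
    fi
    echo "push: $podspec $version"
    "$SPEC_PUSH_CMD" "$podspec"
}
```

With this shape, re-running a job that timed out halfway through a list of podspecs only pushes the ones still missing.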

GitLab Release Pipeline

The release pipeline consists of 5 sequential jobs:

  1. Build Artifacts - Creates and validates two artifacts for the next jobs:
    • artifacts/<tag>/Datadog.xcframework.zip - XCFrameworks bundle for GH Release;
    • artifacts/<tag>/dd-sdk-ios - fresh clone of the repo for linting and publishing Cocoapods podspecs.
  2. Publish GH Asset - Publishes Datadog.xcframework.zip to GH Release.
  3. Publish CP podspecs (internal) - Lints and pushes DatadogInternal.podspec to trunk, starting only after (2) succeeds to avoid leaving the release in an inconsistent state.
  4. Publish CP podspecs (dependent) - Lints and pushes main SDK podspecs (Core, Logs, Trace, RUM, SR, CR, and WVT) 20 minutes after (3) succeeds, anticipating the availability of DatadogInternal.podspec in the CP trunk.
  5. Publish CP podspecs (legacy) - Lints and pushes legacy podspecs (for backward compatibility with V1), including DatadogObjc.podspec, 20 minutes after (4) to ensure main podspecs are available.

All "publish" jobs use the needs syntax for dependencies.

Starting Release Pipeline Manually (DRY_RUN)

For debugging and troubleshooting, the release pipeline can be manually started with RELEASE_GIT_TAG and RELEASE_DRY_RUN CI env variables. Setting RELEASE_DRY_RUN=1 runs the entire pipeline but skips publishing artifacts. RELEASE_GIT_TAG can point to any existing release tag for testing.
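
A minimal sketch of how a dry-run guard like this could look in shell. RELEASE_DRY_RUN is the variable named above; the publish wrapper is a hypothetical name and is not taken from the PR's scripts.

```shell
# Run the given command, unless RELEASE_DRY_RUN=1, in which case only print it.
# "publish" is a hypothetical helper name used for illustration.
publish() {
    if [ "${RELEASE_DRY_RUN:-0}" = "1" ]; then
        echo "[dry-run] would run: $*"
        return 0
    fi
    "$@"
}
```

For example, with RELEASE_DRY_RUN=1 a call such as publish pod trunk push DatadogInternal.podspec would only print the command, while every non-publishing step of the pipeline still runs.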

CI Secrets

Introducing this automation required implementing the flow for managing CI secrets. Following internal guidelines, secrets are managed using Vault. Vault is already configured in CI runners. To authenticate with it on local machines, I added convenience automation in tools/secrets, accessible through make set-ci-secret.

It uses OpenID to authorize operations through a web browser and existing Google account sessions. Everyone on the team should already be authorized to manage CI secrets. Follow this guide (internal) for more info.

Building XCFrameworks

The tools/release/build-xcframeworks.sh added in this PR is an enhanced version of the previous build-xcframework.sh script. The core logic remains unchanged, but it now supports --ios and --tvos options for platform-specific slices. The release pipeline passes both flags, while the iOS and tvOS smoke tests each pass only their own, which improves performance:

  • Smoke Tests (iOS) is now 3 minutes faster 🏎️ (~22min -> ~19min)
  • Smoke Tests (tvOS) is now 6 minutes faster 🏎️ (~19min -> ~13min)
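
To illustrate the --ios / --tvos options, here is a small sketch of flag parsing in shell. It is an assumption about the general shape only, not the actual contents of build-xcframeworks.sh; the parse_platforms name and its output format are hypothetical.

```shell
# Parse --ios / --tvos and print the selected platforms, one per line.
# Hypothetical helper; the real script's option handling may differ.
parse_platforms() {
    local ios=0
    local tvos=0
    while [ "$#" -gt 0 ]; do
        case "$1" in
            --ios)  ios=1 ;;
            --tvos) tvos=1 ;;
            *) echo "unknown option: $1" >&2; return 1 ;;
        esac
        shift
    done
    if [ "$ios" = "1" ]; then echo "iOS"; fi
    if [ "$tvos" = "1" ]; then echo "tvOS"; fi
    return 0
}
```

A smoke test that passes only --ios would then build just the iOS slices, which is where the time savings listed above would come from.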

🎁 Extra bits / not done in this PR

No Slack notifications are implemented in the release pipeline yet. I plan to work on this separately, implementing notifications for all key pipelines.

Future plans include automating:

Review checklist

  • Feature or bugfix MUST have appropriate tests (unit, integration)
  • Make sure each commit and the PR mention the Issue number or JIRA reference
  • Add CHANGELOG entry for user facing changes

Custom CI job configuration (optional)

  • Run unit tests for Session Replay

@ncreated ncreated self-assigned this Jul 5, 2024
@ncreated ncreated marked this pull request as ready for review July 5, 2024 15:39
@ncreated ncreated requested review from a team as code owners July 5, 2024 15:39

@ganeshnj (Contributor) left a comment

Some minor comments but overall right on the track. Great continuation.

rm -rf "$ARTIFACTS_PATH"
mkdir -p "$ARTIFACTS_PATH"

clone_repo

Contributor

Could we clean the repo state rather than cloning a new copy?

Essentially git clean -fxd

Member Author

Yes, we can, but this would significantly change the existing release model introduced to Bitrise in this PR. While migrating to GitLab, I wanted to keep things simple and avoid exposing the release to new challenges.

In #636, we switched from using "cleanup" scripts to performing a fresh clone of the repo for two reasons:

  1. Overlooking some dirty state during "cleanup" led to shipping unnecessary content in three release artifacts.
  2. There was no CI-automated way to re-build artifacts for past releases using the new tooling.

While git clean -fxd is definitely better than the custom "cleanup" we used to have before #636, performing a release from the current repo doesn't offer the same level of flexibility in CI. Specifically:

  • With the current model, it is possible to test the entire release pipeline on any branch by targeting an existing tag (using RELEASE_GIT_TAG and RELEASE_DRY_RUN options). I leveraged this several times to test this PR.
  • In rare situations, we had to modify release tooling and re-build artifacts for existing tags using a new version of Xcode on CI (possible by using RELEASE_GIT_TAG=<past release>, RELEASE_DRY_RUN=0, and OVERWRITE_EXISTING=1).

Likely, we don't need such a level of CI flexibility these days ☝️, as module stability was finally fixed in 2.x.

However, cleaning the current repo while developing release tooling locally might be very cumbersome and impractical, given that we track a lot of dev configuration through .local.xcconfig files (which would be wiped out). Performing a fresh clone of the repo into artifacts/<tag>/ is more convenient and less invasive.

Let me know how that sounds @ganeshnj

Contributor

I understand the use case, but I don't think it is a good approach to release something other than what the current branch was triggered from. If we are building such functionality, it means this is not the right separation and these tools must reside somewhere else. This is a CI/CD anti-pattern.

Let's not block the PR but I would want to release the changes which triggered the workflow.

Member Author

> but I don't think it is a good approach to have something released other than what the current branch is triggered from

This is definitely not the case here. For tag triggers, the flow is this:

  • GitLab clones the repo to ~/dd-sdk-ios for given GIT_TAG
  • ~/dd-sdk-ios/tools/release/build.sh starts
  • it clones the repo for the same GIT_TAG into ~/dd-sdk-ios/artifacts/<GIT_TAG>/dd-sdk-ios
  • finally, ~/dd-sdk-ios/tools/release/ automation:
    • builds and publishes XCFramework for ~/dd-sdk-ios/artifacts/<GIT_TAG>/dd-sdk-ios
    • publishes Cocoapod podspecs for ~/dd-sdk-ios/artifacts/<GIT_TAG>/dd-sdk-ios

Ultimately, the automation will release exactly the hash that it was triggered for, so we can't speak of anti-patterns.
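
The clone step in this flow could be sketched roughly as below. The artifacts/<tag>/dd-sdk-ios layout comes from this PR; the function names, the repo URL parameter, and the use of a shallow clone (--depth 1) are assumptions made for illustration, not the contents of tools/release/build.sh.

```shell
# Path of the fresh clone for a given tag (layout as described in this PR).
artifacts_repo_path() {
    echo "artifacts/$1/dd-sdk-ios"
}

# Clone the repo at <tag> into artifacts/<tag>/dd-sdk-ios.
# Hypothetical helper; arguments: <repo_url> <tag>.
clone_for_tag() {
    local repo_url="$1"
    local tag="$2"
    local dest
    # Refuse to run without a tag, so the cleanup below never targets "artifacts/".
    [ -n "$tag" ] || { echo "missing tag" >&2; return 1; }
    dest="$(artifacts_repo_path "$tag")"
    rm -rf "artifacts/$tag"
    mkdir -p "artifacts/$tag"
    # git clone --branch also accepts a tag name; --depth 1 is an assumed optimization.
    git clone --depth 1 --branch "$tag" "$repo_url" "$dest"
}
```

Because the release scripts then operate on the cloned path rather than on the working copy, local files such as .local.xcconfig stay untouched.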

That said, I agree that this implementation detail might look custom, but it was made intentionally to address concrete problems we hit when releasing an SDK that supports several iOS dependency managers, each operating under a different model.

> Let's not block the PR

👍 Also note that all tools/release/ scripts introduced in this PR do support path parameters, making the release of . instead of artifacts/<GIT_TAG>/dd-sdk-ios very simple to achieve.

Contributor

Thanks for the clarification.

ganeshnj previously approved these changes Jul 9, 2024
ncreated added 3 commits July 10, 2024 09:19
- prefer shell scripts over python automation
- deintegrate Bitrise config
- cleanup release.py
@ncreated ncreated force-pushed the ncreated/RUM-4079/migrate-release-automation branch from 05aedd2 to 0458bd1 Compare July 10, 2024 07:26
@ncreated ncreated force-pushed the ncreated/RUM-4079/migrate-release-automation branch from 0458bd1 to 3a05078 Compare July 10, 2024 08:30
@ncreated ncreated requested a review from ganeshnj July 10, 2024 09:09
@ncreated ncreated merged commit 2f9a7df into develop Jul 10, 2024
15 checks passed
@ncreated ncreated deleted the ncreated/RUM-4079/migrate-release-automation branch July 10, 2024 09:33
@maciejburda maciejburda mentioned this pull request Jul 10, 2024
4 tasks
@ncreated ncreated mentioned this pull request Jul 25, 2024
3 tasks
3 participants