Apply fails on Gitlab 16.4.0 with "Pull request must be mergeable before running apply." #3722

Open
syst0m opened this issue Aug 31, 2023 · 25 comments
Labels
bug Something isn't working Stale

Comments

@syst0m

syst0m commented Aug 31, 2023

Community Note

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request. Searching for pre-existing feature requests helps us consolidate datapoints for identical requirements into a single place, thank you!
  • Please do not leave "+1" or other comments that do not add relevant new information or questions, they generate extra noise for issue followers and do not help prioritize the request.
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment.

Overview of the Issue

Running atlantis apply on an approved Gitlab PR fails intermittently with Apply Failed: Pull request must be mergeable before running apply.

Reproduction Steps

  1. Run atlantis apply on approved PR
  2. Atlantis comments with error: **Apply Failed**: Pull request must be mergeable before running apply.

Logs

August 23, 2023 at 17:03 (UTC+1:00)	{"level":"error","ts":"2023-08-23T16:03:28.369Z","caller":"events/instrumented_project_command_runner.go:84","msg":"Failure running apply operation: Pull request must be mergeable before running apply.","json":{"repo":"ume-platform-engineering/tf-env-data-engineering","pull":"189"},"stacktrace":"github.com/runatlantis/atlantis/server/events.RunAndEmitStats\n\tgithub.com/runatlantis/atlantis/server/events/instrumented_project_command_runner.go:84\ngithub.com/runatlantis/atlantis/server/events.(*InstrumentedProjectCommandRunner).Apply\n\tgithub.com/runatlantis/atlantis/server/events/instrumented_project_command_runner.go:46\ngithub.com/runatlantis/atlantis/server/events.runProjectCmds\n\tgithub.com/runatlantis/atlantis/server/events/project_command_pool_executor.go:48\ngithub.com/runatlantis/atlantis/server/events.(*ApplyCommandRunner).Run\n\tgithub.com/runatlantis/atlantis/server/events/apply_command_runner.go:166\ngithub.com/runatlantis/atlantis/server/events.(*DefaultCommandRunner).RunCommentCommand\n\tgithub.com/runatlantis/atlantis/server/events/command_runner.go:298"}
August 23, 2023 at 17:03 (UTC+1:00)	{"level":"info","ts":"2023-08-23T16:03:29.799Z","caller":"events/automerger.go:20","msg":"not automerging because project at dir \"XXX\", workspace \"default\" has status \"apply_errored\"","json":{"repo":"XXX","pull":"189"}}
August 23, 2023 at 17:03 (UTC+1:00)	{"level":"error","ts":"2023-08-23T16:03:32.912Z","caller":"logging/simple_logger.go:163","msg":"invalid key: e4d66b69-e7c7-495a-87fd-20f677f74b13","json":{},"stacktrace":"github.com/runatlantis/atlantis/server/logging.(*StructuredLogger).Log\n\tgithub.com/runatlantis/atlantis/server/logging/simple_logger.go:163\ngithub.com/runatlantis/atlantis/server/controllers.(*JobsController).respond\n\tgithub.com/runatlantis/atlantis/server/controllers/jobs_controller.go:92\ngithub.com/runatlantis/atlantis/server/controllers.(*JobsController).getProjectJobsWS\n\tgithub.com/runatlantis/atlantis/server/controllers/jobs_controller.go:70\ngithub.com/runatlantis/atlantis/server/controllers.(*JobsController).GetProjectJobsWS\n\tgithub.com/runatlantis/atlantis/server/controllers/jobs_controller.go:83\nnet/http.HandlerFunc.ServeHTTP\n\tnet/http/server.go:2122\ngithub.com/gorilla/mux.(*Router).ServeHTTP\n\tgithub.com/gorilla/[email protected]/mux.go:210\ngithub.com/urfave/negroni/v3.Wrap.func1\n\tgithub.com/urfave/negroni/[email protected]/negroni.go:59\ngithub.com/urfave/negroni/v3.HandlerFunc.ServeHTTP\n\tgithub.com/urfave/negroni/[email protected]/negroni.go:33\ngithub.com/urfave/negroni/v3.middleware.ServeHTTP\n\tgithub.com/urfave/negroni/[email protected]/negroni.go:51\ngithub.com/runatlantis/atlantis/server.(*RequestLogger).ServeHTTP\n\tgithub.com/runatlantis/atlantis/server/middleware.go:70\ngithub.com/urfave/negroni/v3.middleware.ServeHTTP\n\tgithub.com/urfave/negroni/[email protected]/negroni.go:51\ngithub.com/urfave/negroni/v3.(*Recovery).ServeHTTP\n\tgithub.com/urfave/negroni/[email protected]/recovery.go:210\ngithub.com/urfave/negroni/v3.middleware.ServeHTTP\n\tgithub.com/urfave/negroni/[email protected]/negroni.go:51\ngithub.com/urfave/negroni/v3.(*Negroni).ServeHTTP\n\tgithub.com/urfave/negroni/[email protected]/negroni.go:111\nnet/http.serverHandler.ServeHTTP\n\tnet/http/server.go:2936\nnet/http.(*conn).serve\n\tnet/http/server.go:1995"}

Environment details

  • Atlantis version: v0.23.2 & v0.23.4
  • Deployment method: ecs
  • If not running the latest Atlantis version have you tried to reproduce this issue on the latest version: no
  • Gitlab 16.4.0

Atlantis server-side config file:

{
    "taskDefinitionArn": "arn:aws:ecs:XXX:XXX",
    "containerDefinitions": [
        {
            "name": "atlantis",
            "image": "XXX/atlantis:v0.24.3-0bcdeaa1",
            "cpu": 2048,
            "memory": 4096,
            "memoryReservation": 128,
            "portMappings": [
                {
                    "containerPort": 4141,
                    "hostPort": 4141,
                    "protocol": "tcp"
                }
            ],
            "essential": true,
            "environment": [
                {
                    "name": "ATLANTIS_ALLOW_REPO_CONFIG",
                    "value": "false"
                },
                {
                    "name": "ATLANTIS_HIDE_PREV_PLAN_COMMENTS",
                    "value": "false"
                },
                {
                    "name": "ATLANTIS_WRITE_GIT_CREDS",
                    "value": "true"
                },
                {
                    "name": "ATLANTIS_SILENCE_NO_PROJECTS",
                    "value": "false"
                },
                {
                    "name": "ATLANTIS_GITLAB_USER",
                    "value": "XXX"
                },
                {
                    "name": "ATLANTIS_LOG_LEVEL",
                    "value": "debug"
                },
                {
                    "name": "ATLANTIS_AUTOMERGE",
                    "value": "true"
                },
                {
                    "name": "ATLANTIS_BITBUCKET_USER",
                    "value": ""
                },
                {
                    "name": "ATLANTIS_REPO_CONFIG_JSON",
                    "value": "{\"repos\":[{\"allow_custom_workflows\":true,\"allowed_overrides\":[\"workflow\"],\"apply_requirements\":[\"undiverged\",\"mergeable\",\"approved\"],\"id\":\"/.*/\",\"repo_config_file\":\"atlantis.yaml\"}],\"workflows\":{\"default\":{\"apply\":{\"steps\":[\"apply\",{\"run\":\"[ ! -z \\\"$PROJECT_NAME\\\" ] \\u0026\\u0026 export TAG=$PROJECT_NAME || export TAG=last-applied \\u0026\\u0026 git config --global user.name XXXn \\u0026\\u0026 git config --global user.email XXX \\u0026\\u0026 git fetch --tags -f \\u0026\\u0026 git fetch --all --tags \\u0026\\u0026 (git tag --delete $TAG || true) \\u0026\\u0026 git tag -a $TAG -m \\\"Tagged automatically by atlantis\\\" \\u0026\\u0026 (git push origin --delete $TAG || true) \\u0026\\u0026 git push origin $TAG\"}]},\"plan\":{\"steps\":[\"init\",\"plan\"]}}}}"
                },
                {
                    "name": "ATLANTIS_PARALLEL_POOL_SIZE",
                    "value": "50"
                },
                {
                    "name": "ATLANTIS_REPO_ALLOWLIST",
                    "value": "XXX"
                },
                {
                    "name": "ATLANTIS_GITLAB_HOSTNAME",
                    "value": "gitlab.com"
                },
                {
                    "name": "ATLANTIS_DEFAULT_TF_VERSION",
                    "value": "v0.13.4"
                },
                {
                    "name": "ATLANTIS_GH_APP_ID",
                    "value": ""
                },
                {
                    "name": "ATLANTIS_BITBUCKET_BASE_URL",
                    "value": ""
                },
                {
                    "name": "ATLANTIS_PORT",
                    "value": "4141"
                },
                {
                    "name": "ATLANTIS_GH_USER",
                    "value": ""
                },
                {
                    "name": "ATLANTIS_ATLANTIS_URL",
                    "value": "https://XXX"
                }
            ],
            "mountPoints": [],
            "volumesFrom": [],
            "secrets": [
XXX
]

Repo atlantis.yaml file:

version: 3
projects:

  - name: platform
    dir: XXX/platform
    autoplan:
      when_modified: [ "*.tf" ]
      enabled: true
...

Additional Context

Issue started happening on Atlantis v0.23.2. Still happening after upgrading to v0.23.4.

@syst0m syst0m added the bug Something isn't working label Aug 31, 2023
@jamengual
Contributor

jamengual commented Aug 31, 2023 via email

@saraangelmurphy

saraangelmurphy commented Sep 26, 2023

Hi, my organization is also experiencing this issue after upgrading to Gitlab v16.4.0. It affects the latest version of atlantis (v0.25.0), as well as version v0.24.2. The only workaround is to disable required approvals altogether in repos.yaml, which of course is not ideal.

Overview of the Issue

When executing atlantis apply in a comment for a gitlab merge request, atlantis fails with the message "Apply Failed: Pull request must be mergeable before running apply"

Reproduction Steps

  1. Create a gitlab v16.4.0 instance with a webhook to communicate with atlantis
  2. Create a merge request to a repository, with atlantis' repos.yaml file configured to require approvals
  3. Obtain a merge request approval
  4. Execute atlantis apply
  5. atlantis fails with the message "Apply Failed: Pull request must be mergeable before running apply"

Logs

From systemctl status atlantis:

Sep 26 16:28:26 cit-prod-atlantis atlantis[13665]: {"level":"warn","ts":"2023-09-26T16:28:26.049Z","caller":"events/apply_command_runner.go:111","msg":"unable to get pull request status: fetching mergeability status for repo: tf-infrastructure-gcp/bv-infra, and pull number: 1026: GET https://gitlab.c.bvnt.co/api/v4/projects/274: 500 {message: 500 Internal Server Error}. Continuing with mergeable and approved assumed false","json":{"repo":"tf-infrastructure-gcp/bv-infra","pull":"1026"},"stacktrace":"github.com/runatlantis/atlantis/server/events.(*ApplyCommandRunner).Run\n\t/home/runner/work/atlantis/atlantis/server/events/apply_command_runner.go:111\ngithub.com/runatlantis/atlantis/server/events.(*DefaultCommandRunner).RunCommentCommand\n\t/home/runner/work/atlantis/atlantis/server/events/command_runner.go:298"}

Environment details

  • Atlantis version: v0.25.0
  • Deployment method: binary on EC2 AWS instance
  • If not running the latest Atlantis version have you tried to reproduce this issue on the latest version: yes
  • Atlantis flags: /home/atlantis/atlantis server --config /home/atlantis/atlantis.yaml

Atlantis server-side config file:

root@cit-prod-atlantis:/home/atlantis# cat atlantis.yaml
---
# Minimum required settings
atlantis-url: https://atlantis.c.bvnt.co:4141
gitlab-hostname: gitlab.c.bvnt.co
gitlab-token: '<SNIP>'
gitlab-user: atlantisbot
gitlab-webhook-secret: '<SNIP>'
repo-whitelist: 'gitlab.c.bvnt.co/*'
# Optional Settings
data-dir: /home/atlantis/.atlantis
default-tf-version: v0.11.15
port: 4141
repo-config: /home/atlantis/repos.yaml
ssl-cert-file: /home/atlantis/ssl/wc.c.bvnt.co.crt
ssl-key-file: /home/atlantis/ssl/wc.c.bvnt.co.key
# This setting means that atlantis will merge the destination branch into the source before running plans
checkout-strategy: merge
...

Repo atlantis.yaml file:

---
version: 3
projects:
- dir: prod/gcp-hashicorp-vault
  workflow: vault
  terraform_version: v1.5.5
  autoplan:
    when_modified: ["*.tf"]
.... # Repeated 3-400 times
.... # this also occurs in repos with only one or two atlantis "projects"

repos.yaml file

root@cit-prod-atlantis:/home/atlantis# cat repos.yaml
repos:
- id: /.*/
  apply_requirements: [approved] # This is the line we removed to work around this issue
  allowed_overrides: [workflow]

# Global workflows
workflows:
  default:
    plan:
      steps:
        - run: /home/atlantis/atlantis-verify.sh plan $COMMENT_ARGS $BASE_REPO_OWNER $BASE_REPO_NAME
        - init
        - plan
        - run: /home/atlantis/bv-conftest.sh
    apply:
      steps:
        - run: /home/atlantis/atlantis-verify.sh apply $COMMENT_ARGS $BASE_REPO_OWNER $BASE_REPO_NAME
        - apply
  vault:
    plan:
      steps:
        - env:
            name: VAULT_TOKEN
            command: /home/atlantis/vault-login.sh
        - run: /home/atlantis/atlantis-verify.sh plan $COMMENT_ARGS $BASE_REPO_OWNER $BASE_REPO_NAME
        - init:
            extra_args: ["-reconfigure"]
        - plan
        - run: /home/atlantis/bv-conftest.sh
    apply:
      steps:
        - env:
            name: VAULT_TOKEN
            command: /home/atlantis/vault-login.sh
        - run: /home/atlantis/atlantis-verify.sh apply $COMMENT_ARGS $BASE_REPO_OWNER $BASE_REPO_NAME
        - apply
  tfvars:
    plan:
      steps:
        - run: /home/atlantis/atlantis-verify.sh plan $COMMENT_ARGS $BASE_REPO_OWNER $BASE_REPO_NAME
        - run: /bin/cp "$DIR/_tfvars/$WORKSPACE.tfvars" "$DIR/temp-tfvars-file-copy.auto.tfvars"
        - init
        - plan
        - run: /home/atlantis/bv-conftest.sh
    apply:
      steps:
        - run: /home/atlantis/atlantis-verify.sh apply $COMMENT_ARGS $BASE_REPO_OWNER $BASE_REPO_NAME
        - apply
  tfvars_vault:
    plan:
      steps:
        - env:
            name: VAULT_TOKEN
            command: /home/atlantis/vault-login.sh
        - run: /home/atlantis/atlantis-verify.sh plan $COMMENT_ARGS $BASE_REPO_OWNER $BASE_REPO_NAME
        - run: /bin/cp "$DIR/_tfvars/$WORKSPACE.tfvars" "$DIR/temp-tfvars-file-copy.auto.tfvars"
        - init
        - plan
        - run: /home/atlantis/bv-conftest.sh
    apply:
      steps:
        - env:
            name: VAULT_TOKEN
            command: /home/atlantis/vault-login.sh
        - run: /home/atlantis/atlantis-verify.sh apply $COMMENT_ARGS $BASE_REPO_OWNER $BASE_REPO_NAME
        - apply
  global_sgrules:
    plan:
      steps:
        - env:
            name: VAULT_TOKEN
            command: /home/atlantis/vault-login.sh
        - run: /home/atlantis/atlantis-verify.sh plan $COMMENT_ARGS $BASE_REPO_OWNER $BASE_REPO_NAME
        - init
        - plan
    apply:
      steps:
        - env:
            name: VAULT_TOKEN
            command: /home/atlantis/vault-login.sh
        - run: /home/atlantis/atlantis-verify.sh apply $COMMENT_ARGS $BASE_REPO_OWNER $BASE_REPO_NAME
        - apply:
            extra_args: ['--parallelism=1']
  vaultforbluevoyantproduction:
    plan:
      steps:
        - env:
            name: VAULT_TOKEN
            command: /home/atlantis/vault-login.sh
        - run: /home/atlantis/atlantis-verify.sh plan $COMMENT_ARGS $BASE_REPO_OWNER $BASE_REPO_NAME
        - init:
            extra_args: ["-reconfigure"]
        - plan
        - run: /home/atlantis/bv-conftest.sh
    apply:
      steps:
        - env:
            name: VAULT_TOKEN
            command: /home/atlantis/vault-login.sh
        - run: /home/atlantis/atlantis-verify.sh apply $COMMENT_ARGS $BASE_REPO_OWNER $BASE_REPO_NAME
        - apply

Additional Context

#3277 seems related, but has not fixed the problem fully.

@syst0m
Author

syst0m commented Sep 26, 2023

@jamengual Any updates on this? Thanks!

I will try 0.25.0 and see.

@jamengual
Contributor

I do not use Gitlab, sadly, so I never got to try this. @lukemassa, have you had this issue?

@lukemassa
Contributor

I don't, but if I had to guess I'd say it's related to the fact that gitlab's determination of whether an MR is "mergeable" is, as far as I understand it, asynchronous. If atlantis catches it at the wrong time, gitlab may still be "figuring out" whether the MR is mergeable. Let me see if I can reproduce this on my setup and get back to you.
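To picture the timing problem described above, a client could re-read the merge status a few times before trusting it. The sketch below is not how Atlantis behaves; `wait_for_settled_status`, its parameters, and the injected `fetch_status` callable are hypothetical names for illustration. It assumes the status string comes from the `detailed_merge_status` field of GitLab's merge request API, where `unchecked` and `checking` mean the check has not finished.

```python
import time
from typing import Callable

# Values of detailed_merge_status that mean GitLab has not finished checking.
TRANSIENT_STATUSES = {"unchecked", "checking"}


def wait_for_settled_status(
    fetch_status: Callable[[], str],
    attempts: int = 5,
    delay_seconds: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Call fetch_status until it returns a settled value or attempts run out.

    fetch_status is a caller-supplied function (hypothetical here) that returns
    the current detailed_merge_status string for one merge request.
    """
    status = fetch_status()
    for _ in range(attempts - 1):
        if status not in TRANSIENT_STATUSES:
            break
        sleep(delay_seconds)
        status = fetch_status()
    return status
```

Injecting `fetch_status` and `sleep` keeps the retry logic separate from any HTTP client, so it can be exercised without a GitLab instance.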

@lukemassa
Contributor

lukemassa commented Sep 27, 2023

So I took a look and I am unable to reproduce this on either 0.24.2, 0.25.0 or main. However, my company's gitlab is on v16.2.3-ee, whereas the reports say 16.4.0. Additionally, a line in the logs pasted above jumped out at me, which is:

GET https://gitlab.c.bvnt.co/api/v4/projects/274: 500 {message: 500 Internal Server Error}

This makes me think there's a gitlab bug, not an atlantis bug.

@saraangelmurphy What happens when you go to https://gitlab.c.bvnt.co/api/v4/projects/274 in your browser? In my case the analogous URL gets me something that looks like:

{"id":17829,"description":null,"name":"Lmassa Test Atlantis" ...

Also for what it's worth I think this is the line that's failing: https://github.com/runatlantis/atlantis/blob/main/server/events/vcs/gitlab_client.go#L293

Otherwise, I'm happy to dig into this bug when my company upgrades their gitlab instance to 16.4.0.
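For anyone who would rather script the projects API check than use a browser, here is a minimal sketch using only the Python standard library. The helper names, host, and token are placeholders made up for illustration; `PRIVATE-TOKEN` is the header GitLab's REST API accepts for personal access tokens.

```python
import urllib.error
import urllib.request


def build_project_request(base_url: str, project_id: int, token: str) -> urllib.request.Request:
    """Build a GET request for /api/v4/projects/:id (the endpoint seen in the log)."""
    url = f"{base_url.rstrip('/')}/api/v4/projects/{project_id}"
    return urllib.request.Request(url, headers={"PRIVATE-TOKEN": token})


def project_api_status(request: urllib.request.Request, timeout: float = 10.0) -> int:
    """Send the request and return the HTTP status code, including error codes."""
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urlopen raises on 4xx/5xx; the code is what we want to inspect.
        return err.code
```

Calling `project_api_status(build_project_request("https://gitlab.example.com", 274, "<token>"))` against a healthy instance should return 200; a 500 here would match the log line quoted above.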

@infernaltechnology

Given the critical vulnerability in previous versions, I hope that you guys prioritize the upgrade

https://nvd.nist.gov/vuln/detail/CVE-2023-5009

@saraangelmurphy

@lukemassa you are entirely correct, that URL does return a 500 error in gitlab 16.4.0. We've raised a ticket with them, but it does certainly appear to be a regression on their end if the projects API works fine on 16.2.X.

@jamengual
Contributor

ohhhh Gitlab.....is still better than Bitbucket

Let's see what they say in your ticket, if you can link it here @saraangelmurphy that will be useful.

@syst0m
Author

syst0m commented Sep 29, 2023

> So I took a look and I am unable to reproduce this on either 0.24.2, 0.25.0 or main. However, my company's gitlab is on v16.2.3-ee, whereas the reports say 16.4.0. Additionally, a line in the logs pasted above jumped out at me, which is:
>
> GET https://gitlab.c.bvnt.co/api/v4/projects/274: 500 {message: 500 Internal Server Error}
>
> This makes me think there's a gitlab bug, not an atlantis bug.
>
> @saraangelmurphy What happens when you go to https://gitlab.c.bvnt.co/api/v4/projects/274 in your browser? In my case the analogous URL gets me something that looks like:
>
> {"id":17829,"description":null,"name":"Lmassa Test Atlantis" ...
>
> Also for what it's worth I think this is the line that's failing: https://github.com/runatlantis/atlantis/blob/main/server/events/vcs/gitlab_client.go#L293
>
> Otherwise, I'm happy to dig into this bug when my company upgrades their gitlab instance to 16.4.0.

I don't see a 500 response in the logs, like @saraangelmurphy does.

@lukemassa
Contributor

I was able to get my hands on a server with 16.4.0 and was unfortunately unable to reproduce either the original issue or the 500 issue. The original issue being intermittent makes it especially tricky. Any more logs or attempts to isolate would be helpful.

@saraangelmurphy

saraangelmurphy commented Oct 3, 2023

So this ended up being an issue with an incomplete upgrade to gitlab: SQL migration scripts failed during the upgrade when applying constraints to gitlab project regex rules. Specifically, regexes needed to be <521 characters, and we had several that were over. The presence of these project push rules during the upgrade broke the projects API.

Once we removed the excessively lengthy push rule regexes and reran the migrations, the projects API started responding again and atlantis was happy.
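A quick way to spot such rules before an upgrade is to scan each project's push rule for over-long regex values. This is only a sketch: `overlong_regex_fields` is a hypothetical helper, the input is assumed to be the JSON object returned by GitLab's project push rules API (fields such as `commit_message_regex` or `branch_name_regex`), and the limit is passed in rather than hard-coded because the thread only states it as "<521 characters".

```python
from typing import Dict, List


def overlong_regex_fields(push_rule: Dict[str, object], max_length: int) -> List[str]:
    """Return the names of *_regex fields whose string value exceeds max_length."""
    return sorted(
        field
        for field, value in push_rule.items()
        if field.endswith("_regex") and isinstance(value, str) and len(value) > max_length
    )
```

An empty list means no regex field in that push rule is over the chosen limit.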

https://gitlab.com/gitlab-org/gitlab/-/issues/426066#note_1575868016

I also want to applaud @lukemassa @jamengual , and the other maintainers of atlantis! While my thanks is poor reward compared to financial support, the alacrity of responses here is better than we get from many paid projects. Atlantis is a fantastic project, and we are profoundly grateful for everyone's hard work making this a production-ready service that solves many needs.

@jamengual
Contributor

I'm so glad to hear you solved the issue and that we were able to help but I have to say that without @lukemassa we could not have done it, we need more people like that in this project and the world too.

@lukemassa
Contributor

@saraangelmurphy that's great to hear!

@syst0m any chance that saraangelmurphy's fix was helpful in addressing your issue? If not, let us know if you have any luck isolating the issue or finding a way to reproduce it consistently.

@syst0m
Author

syst0m commented Oct 5, 2023

> @saraangelmurphy that's great to hear!
>
> @syst0m any chance that saraangelmurphy's fix was helpful in addressing your issue? If not, let us know if you have any luck isolating the issue or finding a way to reproduce it consistently.

I'm afraid not. It looks like they are running a self-managed gitlab instance, IIUC.
We are using the hosted Gitlab version, so we have no control of the upgrade process and don't use any SQL migration scripts.

In terms of trying to reproduce the issue on my end, I haven't been able to consistently reproduce it, I'm afraid. 😞

@lukemassa
Contributor

My company's gitlab instance is now on 16.4.0, and I'm running atlantis 0.24.4, and I've not yet experienced or had reports of unmergeability for approval-required MRs. If anyone else is experiencing this, let us know and hopefully we can figure out some commonalities!

@syst0m
Author

syst0m commented Oct 19, 2023

@lukemassa We are still experiencing this issue, with multiple atlantis instances, which are handling different repos.
I'm attaching logs from 3 different occurrences of the issue.
The 1st log file contains some errors, which don't seem to be logged during the other 2 occurrences.
Not sure if they are related to the issue or not.

log_1.txt
log_2.txt
log_3.txt

| 1697556588198 | {"level":"warn","ts":"2023-10-17T15:29:48.198Z","caller":"events/events_controller.go:572","msg":"Failed to react to comment: POST https://gitlab.com/api/v4/projects/ume-platform-engineering/tf-env-data-platform/merge_requests/254/notes/1607452833/award_emoji: 404 {message: 404 Award Emoji Name has already been taken Not Found}","json":{},"stacktrace":"github.com/runatlantis/atlantis/server/controllers/events.(*VCSEventsController).handleCommentEvent\n\tgithub.com/runatlantis/atlantis/server/controllers/events/events_controller.go:572\ngithub.com/runatlantis/atlantis/server/controllers/events.(*VCSEventsController).HandleGitlabCommentEvent\n\tgithub.com/runatlantis/atlantis/server/controllers/events/events_controller.go:523\ngithub.com/runatlantis/atlantis/server/controllers/events.(*VCSEventsController).handleGitlabPost\n\tgithub.com/runatlantis/atlantis/server/controllers/events/events_controller.go:501\ngithub.com/runatlantis/atlantis/server/controllers/events.(*VCSEventsController).Post\n\tgithub.com/runatlantis/atlantis/server/controllers/events/events_controller.go:112\nnet/http.HandlerFunc.ServeHTTP\n\tnet/http/server.go:2122\ngithub.com/gorilla/mux.(*Router).ServeHTTP\n\tgithub.com/gorilla/[email protected]/mux.go:210\ngithub.com/urfave/negroni/v3.Wrap.func1\n\tgithub.com/urfave/negroni/[email protected]/negroni.go:59\ngithub.com/urfave/negroni/v3.HandlerFunc.ServeHTTP\n\tgithub.com/urfave/negroni/[email protected]/negroni.go:33\ngithub.com/urfave/negroni/v3.middleware.ServeHTTP\n\tgithub.com/urfave/negroni/[email protected]/negroni.go:51\ngithub.com/runatlantis/atlantis/server.(*RequestLogger).ServeHTTP\n\tgithub.com/runatlantis/atlantis/server/middleware.go:70\ngithub.com/urfave/negroni/v3.middleware.ServeHTTP\n\tgithub.com/urfave/negroni/[email protected]/negroni.go:51\ngithub.com/urfave/negroni/v3.(*Recovery).ServeHTTP\n\tgithub.com/urfave/negroni/[email protected]/recovery.go:210\ngithub.com/urfave/negroni/v3.middleware.ServeHTTP\n\tgithub.com/urfave/negroni/[email protected]/negroni.go:51\ngithub.com/urfave/negroni/v3.(*Negroni).ServeHTTP\n\tgithub.com/urfave/negroni/[email protected]/negroni.go:111\nnet/http.serverHandler.ServeHTTP\n\tnet/http/server.go:2936\nnet/http.(*conn).serve\n\tnet/http/server.go:1995"} | ecs/atlantis-platform-engineering/86ccbbbce04c45dcbfdfca6d39ac9a92 |
| 1697556589543 | {"level":"warn","ts":"2023-10-17T15:29:49.543Z","caller":"events/apply_command_runner.go:97","msg":"unable to update commit status: POST https://gitlab.com/api/v4/projects/ume-platform-engineering/tf-env-data-platform/statuses/4f602f33fb842b9298b646f6776799725abb5b8f: 400 {message: Cannot transition status via :run from :running (Reason(s): Status cannot transition via \"run\")}","json":{"repo":"ume-platform-engineering/tf-env-data-platform","pull":"254"},"stacktrace":"github.com/runatlantis/atlantis/server/events.(*ApplyCommandRunner).Run\n\tgithub.com/runatlantis/atlantis/server/events/apply_command_runner.go:97\ngithub.com/runatlantis/atlantis/server/events.(*DefaultCommandRunner).RunCommentCommand\n\tgithub.com/runatlantis/atlantis/server/events/command_runner.go:298"} | ecs/atlantis-platform-engineering/86ccbbbce04c45dcbfdfca6d39ac9a92 |

@lukemassa
Contributor

Yeah I don't see any smoking gun there, hard to say why exactly atlantis thinks the MRs are not mergeable.

If you're able to run from source, @X-Guardian recently added a lot of debug logs #3876, including to the PullIsMergeable function that I think would really help us narrow down why the code thinks these are not mergeable. If not, we may have to wait until the next release to get more information here.

@lukemassa
Contributor

@syst0m 0.27.0 has been released, which adds a number of debug flags to gitlab calls. When you have a chance, could you upgrade and try to reproduce?

@dougbw

dougbw commented Jan 23, 2024

I work with @syst0m and this is still an issue for us. I have been doing some debugging whenever I have run into this.

I noticed that when this occurs, the Pipelines tab of the MR shows the atlantis plan status as still Running. This is clearly not the case: the plan has been posted back to the MR as a comment, and the main MR page shows the status of the same atlantis plan as Success.

I have an example of this below, and the steps leading up to this was just our standard workflow with no complications:

  • Create feature branch
  • Push commit
  • Raise MR
  • Atlantis plan is executed automatically and the comment/status all looks correct on MR summary page
  • MR is approved
  • atlantis apply comment
  • Apply Failed: Pull request must be mergeable before running apply. error message posted back to MR

MR Summary:
atlantis-mr-summary

MR Pipeline tab:
atlantis-mr-pipeline

In the above case running subsequent atlantis plan / atlantis apply commands on the MR does not resolve the issue (people are generally working around this by pushing new commits).

With this in mind, I explored the gitlab commit status API and wrote a script that manually updates the commit status of the "stuck" atlantis plan job to "success". This lets me fix a stuck MR and work around the issue. So it feels like the issue is caused by atlantis not posting the commit status back correctly: either the wrong request is being sent, or the request is failing for some reason (e.g. network / gitlab transient issues / rate limiting).

I thought I might be able to do the reverse of this workaround to reliably reproduce the problem (raise an MR, then manually set the atlantis plan status to "Running"), but the behaviour wasn't identical. It still blocked the MR with the Pull request must be mergeable before running apply. message, but unlike the above example we could work around it by issuing atlantis plan / atlantis apply commands. So I feel like I am missing something about where in gitlab atlantis pushes the statuses to.
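The script itself was not shared in the thread. As a rough idea of what such a workaround could look like, the sketch below builds a request against GitLab's commit status endpoint (`POST /projects/:id/statuses/:sha`, the same endpoint that appears in the logs earlier in this thread). The function name and arguments are hypothetical, and the status name you pass (for example `atlantis/plan`) is an assumption: it would have to match the name and ref of the stuck status exactly for GitLab to update it rather than create a new one.

```python
import urllib.parse
import urllib.request
from typing import Optional


def build_status_request(
    base_url: str,
    project: str,
    sha: str,
    token: str,
    name: str,
    state: str = "success",
    ref: Optional[str] = None,
) -> urllib.request.Request:
    """Build a POST to /api/v4/projects/:id/statuses/:sha.

    project may be a numeric ID or a "group/repo" path; paths are URL-encoded.
    """
    project_part = urllib.parse.quote(str(project), safe="")
    url = f"{base_url.rstrip('/')}/api/v4/projects/{project_part}/statuses/{sha}"
    params = {"state": state, "name": name}
    if ref is not None:
        params["ref"] = ref
    body = urllib.parse.urlencode(params).encode()
    return urllib.request.Request(
        url, data=body, headers={"PRIVATE-TOKEN": token}, method="POST"
    )
```

Sending the result with `urllib.request.urlopen(...)` performs the update. Building the request separately makes it easy to print and review the URL and body before touching a real MR.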

@oana-l

oana-l commented Jan 25, 2024

@lukemassa Hello,
Please see attached logs, we've upgraded to v0.27.1 atlantis and reproduced the issue initially reported by @syst0m . I really hope we can get some clarification on why this is happening.
Thanks in advance!
atlantis_logs.csv

@lukemassa
Contributor

@oana-l the logs you have there are consistent with the situation where the mergeable requirement is set, the MR has approval rules that prevent merge, and those approval rules have not been met. "Pull request must be mergeable before running apply" is a bit misleading in this context, there's been some discussion about how to unify and clarify terminology across VCSs (#2605). Let me know if that fixes your particular issue.

@dougbw The pipelines still running issue might be related to #3852 (comment), which is an ongoing debate about how to deal with some of the complexities introduced by #3378.

As for the original issue, @syst0m / @dougbw, I'm still unable to reproduce any intermittent behavior here. Have you had any luck with the new logging? It would help us understand which parts of PullIsMergeable() are failing and, if none of them are, whether there's a log issue elsewhere.

@oana-l

oana-l commented Jan 29, 2024

@lukemassa apologies for the misunderstanding, I should have been more specific. I work together with @syst0m and @dougbw.
The logs I provided above are from the scenario that @syst0m reported initially on the intermittent behaviour. They should include the new logging, since they were taken after we deployed the new atlantis image 0.27.1.

The UI issue that @dougbw described is something we noticed ALSO happens when we get the intermittent "Pull request must be mergeable before running apply" error, even though the MR itself is approved.

I hope this is a bit clearer now, thanks in advance!

@lukemassa
Contributor

lukemassa commented Jan 29, 2024

Ah I see. I was hoping that we'd get some information out of the output from the API calls to gitlab, but they all appear to have succeeded. Given that, it has to be something in the logic itself that's causing the issue.

{"level":"debug","ts":"2024-01-23T16:23:26.460Z","caller":"vcs/gitlab_client.go:331","msg":"GET   /projects/21602027/commits/COMMIT_REF/statuses returned:   200","json":{}}

This indicates to me that the code got to at least this line: https://github.com/runatlantis/atlantis/blob/v0.27.1/server/events/vcs/gitlab_client.go#L331.

There are no more log lines between there and the end of the function, where I'm fairly confident it's returning false.
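When reading a large export like the attached CSV, it can help to isolate only the lines emitted by the GitLab client. The sketch below is a hypothetical helper, not part of Atlantis: it assumes each line contains one JSON object in the structured format shown in this thread (with `caller`, `ts` and `msg` keys), possibly surrounded by other text such as a timestamp prefix or table borders.

```python
import json
from typing import Iterable, Iterator, Tuple

_DECODER = json.JSONDecoder()


def gitlab_client_messages(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (ts, msg) for log entries whose caller is vcs/gitlab_client.go."""
    for line in lines:
        start = line.find("{")
        if start == -1:
            continue
        try:
            # raw_decode tolerates trailing text after the JSON object.
            entry, _ = _DECODER.raw_decode(line[start:])
        except json.JSONDecodeError:
            continue
        if not isinstance(entry, dict):
            continue
        if str(entry.get("caller", "")).startswith("vcs/gitlab_client.go"):
            yield str(entry.get("ts", "")), str(entry.get("msg", ""))
```

Printing the result in timestamp order shows the last GitLab call that was logged before the mergeability decision was made.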

Do you have the ability to run atlantis from source? If so I added a bunch more debug lines here: #4186 maybe that'll help us understand what's going on. For example now when I run it against an unapproved MR I get this output:

{"level":"debug","ts":"2024-01-29T13:42:08.227-0500","caller":"vcs/gitlab_client.go:363","msg":"Determined gitlab.PullIsMergeable (which supports DetailedMergeStatus) to be false, detailed merge status is not_approved","json":{}}

If not you could test out some of those API calls and see what they might say, or I could clean up this PR and hopefully get it in to 0.27.2.

Also just a note that atlantis is fully open source and maintained and contributed to by volunteers. I'm happy to continue to try to debug this issue, but also encourage you to dive into the code too if you have issues with it! :)

@fitz7
Contributor

fitz7 commented Feb 6, 2024

We're seeing possibly the same issue with Gitlab.com and Atlantis v0.27.1.

atlantis-issue

It looks as though Atlantis has set the pipeline status of the commit with a merge-requests/3409/head ref and then never set this status to success, so it stays on pending until a new commit is pushed, at which point it becomes canceled. This is then interpreted as unable to merge.

These pipelines with this ref also don't show up in the list of pipelines for the merge request.

We've found the easiest fix is to delete this extra pipeline created by Atlantis and then run atlantis apply.
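To find such leftover statuses without clicking through the UI, one option is to list the statuses of the head commit (GitLab's `GET /projects/:id/repository/commits/:sha/statuses`) and filter for entries that never finished. The helper below is a hypothetical sketch that works on the already-decoded JSON list; the `atlantis/` name prefix is an assumption about how the statuses are named in your setup.

```python
from typing import Dict, List

# Commit status values that mean the job has not reached a final state.
UNFINISHED_STATES = {"pending", "running"}


def unfinished_statuses(statuses: List[Dict], name_prefix: str = "atlantis/") -> List[Dict]:
    """Return statuses that are still pending/running and whose name has the prefix."""
    return [
        status
        for status in statuses
        if status.get("status") in UNFINISHED_STATES
        and str(status.get("name", "")).startswith(name_prefix)
    ]
```

Each returned entry carries its `ref`, which makes a stray `merge-requests/<iid>/head` status like the one described above easy to spot.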


8 participants