
[Bug]: Unable to remove RDS Global Cluster and associated RDS Clusters at one go #39909

Closed
ktrenchev opened this issue Oct 28, 2024 · 10 comments · Fixed by #40333
Labels
bug Addresses a defect in current functionality. service/rds Issues and PRs that pertain to the rds service.
Milestone
v5.81.0

Comments

@ktrenchev

Terraform Core Version

0.13.7

AWS Provider Version

4.53.0

Affected Resource(s)

aws_rds_global_cluster
aws_rds_cluster

Expected Behavior

I want to be able to delete both the RDS Global Cluster and the associated RDS Clusters with a single terraform destroy invocation.

Actual Behavior

When terraform destroy is called it:

  1. Detaches the replica RDS Cluster from the Global RDS Cluster, thus triggering a promotion.
  2. Terraform waits for the replica RDS Cluster to be deleted, but the replica must first be promoted and only then deleted, and the promotion takes longer than the deletion timeout, so the wait times out.
  3. The replica RDS Cluster is eventually deleted from AWS, but the terraform destroy operation fails to delete the other RDS Cluster and the RDS Global Cluster.
  4. A second run of terraform destroy deletes the leftover RDS Global Cluster and RDS Cluster.

Relevant Error/Panic Output Snippet

waiting for RDS Cluster (XXXXXXX) delete: unexpected state 'promoting', wanted target ''. last error: %!s(<nil>

Terraform Configuration Files

N/A, setup is way too complicated to extract the exact configuration.

Steps to Reproduce

  1. Create a new RDS Global Cluster.
  2. Attach an RDS Cluster (primary).
  3. Attach an RDS Cluster (replica).
  4. Run terraform destroy (a minimal configuration sketch follows below).
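
For reference, a minimal Terraform sketch of the setup described above. This is an illustrative reconstruction, not the reporter's actual configuration: the regions, identifiers, engine version, and credentials are placeholders, and the cluster instances are omitted for brevity.

```hcl
# Illustrative reconstruction only — names, regions, engine version, and
# credentials are placeholders, not the reporter's configuration.

provider "aws" {
  alias  = "primary"
  region = "us-west-2"
}

provider "aws" {
  alias  = "secondary"
  region = "us-east-2"
}

resource "aws_rds_global_cluster" "example" {
  global_cluster_identifier = "example-global"
  engine                    = "aurora-postgresql"
  engine_version            = "14.9"
}

resource "aws_rds_cluster" "primary" {
  provider                  = aws.primary
  cluster_identifier        = "example-primary"
  engine                    = aws_rds_global_cluster.example.engine
  engine_version            = aws_rds_global_cluster.example.engine_version
  global_cluster_identifier = aws_rds_global_cluster.example.id
  master_username           = "exampleuser"
  master_password           = "change-me-please1"
  skip_final_snapshot       = true
}

resource "aws_rds_cluster" "secondary" {
  provider                  = aws.secondary
  cluster_identifier        = "example-secondary"
  engine                    = aws_rds_global_cluster.example.engine
  engine_version            = aws_rds_global_cluster.example.engine_version
  global_cluster_identifier = aws_rds_global_cluster.example.id
  skip_final_snapshot       = true

  # The secondary must be created after the primary has joined the global cluster.
  depends_on = [aws_rds_cluster.primary]
}
```

Running terraform destroy against a configuration like this is what triggers the replica detachment and promotion described in the Actual Behavior section.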

Debug Output

No response

Panic Output

No response

Important Factoids

No response

References

No response

Would you like to implement a fix?

None

@ktrenchev ktrenchev added the bug Addresses a defect in current functionality. label Oct 28, 2024

Community Note

Voting for Prioritization

  • Please vote on this issue by adding a 👍 reaction to the original post to help the community and maintainers prioritize this request.
  • Please see our prioritization guide for information on how we prioritize.
  • Please do not leave "+1" or other comments that do not add relevant new information or questions; they generate extra noise for issue followers and do not help prioritize the request.

Volunteering to Work on This Issue

  • If you are interested in working on this issue, please leave a comment.
  • If this would be your first contribution, please review the contribution guide.

@github-actions github-actions bot added the needs-triage Waiting for first response or review from a maintainer. label Oct 28, 2024
@justinretzolk
Member

Hey @ktrenchev 👋 Thank you for taking the time to raise this! While we understand Terraform configurations can get pretty complicated, it's often quite difficult to reproduce scenarios like this without any logging or configuration samples. Since you indicated you're unable to provide a configuration, are you able to provide debug logs (redacted as necessary) instead?

One thing that came up when taking a quick look at this while triaging was the force_destroy argument of the aws_rds_global_cluster resource, which I believe is meant to help with this scenario. Are you able to confirm whether that argument has been configured?
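
For readers landing here, a minimal sketch of how that suggested mitigation is set (the identifier and engine below are illustrative placeholders):

```hcl
# Minimal sketch of the suggested mitigation — the identifier and engine are
# placeholders. force_destroy removes DB Cluster members from the Global
# Cluster when the global cluster itself is destroyed.
resource "aws_rds_global_cluster" "example" {
  global_cluster_identifier = "example-global"
  engine                    = "aurora-postgresql"
  force_destroy             = true
}
```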

@justinretzolk justinretzolk added waiting-response Maintainers are waiting on response from community or contributor. service/rds Issues and PRs that pertain to the rds service. labels Oct 28, 2024
@ktrenchev
Author

Greetings @justinretzolk!

Unfortunately I'm unable to provide debug logs. I did play around with the force_destroy argument of the RDS Global Cluster resource, but it had no effect. I dug around cluster.go myself, and my best estimate is that either:

  1. The destruction of the RDS Global Cluster and associated RDS Clusters at one go is intentionally unsupported (the AWS docs state something along the lines of "there is no 'one button push' deletion process, as RDS databases are usually mission critical"), or
  2. The timeout in waitDBClusterDelete() (called in resourceClusterDelete()) is insufficient, because earlier in resourceClusterDelete() RemoveFromGlobalClusterWithContext() is called on the replica and a promotion is triggered (a possible timeout workaround is sketched below).

I'd be happy with a confirmation that the deletion of a Global RDS Cluster and associated RDS Clusters at one go is supported (meaning there is something wrong with my setup, which, unfortunately, is not unlikely).
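
If the timeout hypothesis in point 2 is correct, one possible (untested) workaround is to extend the delete timeout on the replica cluster so the implicit promotion can complete before Terraform gives up. A sketch, reusing the secondary cluster from the earlier example; the 4h value is an arbitrary assumption, not a verified fix:

```hcl
resource "aws_rds_cluster" "secondary" {
  provider                  = aws.secondary
  cluster_identifier        = "example-secondary"
  engine                    = "aurora-postgresql"
  global_cluster_identifier = aws_rds_global_cluster.example.id
  skip_final_snapshot       = true

  # Give the delete enough headroom for the promotion triggered by removal
  # from the global cluster (the provider default delete timeout is 120m).
  timeouts {
    delete = "4h"
  }
}
```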

@github-actions github-actions bot removed the waiting-response Maintainers are waiting on response from community or contributor. label Oct 29, 2024
@justinretzolk
Member

justinretzolk commented Oct 29, 2024

Thanks for the additional information here @ktrenchev 👍 Completely understand re: logging and configuration samples. I'll let someone from the team or community speak to the specifics here.

Edit: I had a thought that using a later provider version may help, given that we've migrated most of the provider to use AWS SDK for Go V2. While looking into that, I noticed a relevant bug fix in the release notes for 5.24.0.

It may be worth upgrading to at least provider version 5.24.0 and testing again to see if that bug fix resolves your particular issue.

@justinretzolk justinretzolk added waiting-response Maintainers are waiting on response from community or contributor. and removed needs-triage Waiting for first response or review from a maintainer. labels Oct 29, 2024
@github-actions github-actions bot removed the waiting-response Maintainers are waiting on response from community or contributor. label Nov 3, 2024
@Fadih

Fadih commented Nov 3, 2024

@justinretzolk do you know what was changed? I'm still using the same AWS provider, 5.0.0, as before, but since October 13 it has started failing. I can't upgrade the provider to a newer version because that would require a lot of changes in my Terraform infrastructure.

@Fadih

Fadih commented Nov 3, 2024

Steps to reproduce:

  1. Create an AWS global DB.
  2. Add a cluster in the west region with one instance.
  3. Add a replica in the east region with one instance.
  4. Try to restack the complete cluster using a snapshot.

You can see that it starts deleting the instance in the east region, and then, when promoting the east cluster out of the global DB, it doesn't wait for the promotion to finish and starts the deletion directly, so it fails with:
Error: waiting for RDS Cluster (xxxx-dr-global-region-us-east-2-cluster) delete: unexpected state 'promoting', wanted target ''. last error: %!s()

@Fadih

Fadih commented Nov 4, 2024

@justinretzolk I already have force_destroy on the aws_rds_global_cluster resource and it still happens.


github-actions bot commented Dec 5, 2024

Warning

This issue has been closed, meaning that any additional comments are hard for our team to see. Please assume that the maintainers will not see them.

Ongoing conversations amongst community members are welcome; however, the issue will be locked after 30 days. Moving conversations to another venue, such as the AWS Provider forum, is recommended. If you have additional concerns, please open a new issue, referencing this one where needed.

@github-actions github-actions bot added this to the v5.81.0 milestone Dec 5, 2024

This functionality has been released in v5.81.0 of the Terraform AWS Provider. Please see the Terraform documentation on provider versioning or reach out if you need any assistance upgrading.

For further feature requests or bug reports with this functionality, please create a new GitHub issue following the template. Thank you!


I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.
If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Jan 12, 2025