Increased cf push times + CF API unavailability since CAPI 1.132.0 #285

Closed
pommi opened this issue Dec 12, 2022 · 0 comments · Fixed by cloudfoundry/cloud_controller_ng#3111


pommi commented Dec 12, 2022

Issue

Since the upgrade to cf-deployment >= v21.7.0 / CAPI 1.132.0, we're noticing increased deployment times when pushing apps to Cloud Foundry, eventually leading to temporary CF API (cloud controller) unavailability.

Context

In CAPI 1.132.0, Ruby was upgraded from 2 to 3. In Ruby 3, Digest functions appear to be up to 5x slower (source: ruby/digest#35), which means calculating the MD5/SHA1/SHA256 hashes of a droplet takes up to 5x longer and consumes correspondingly more CPU.
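
To check whether a given Ruby build shows this gap, a rough comparison along the following lines may help. This is only a sketch: the helper names and the 16 MiB random payload (standing in for a droplet) are assumptions for illustration, not taken from Cloud Controller, and the timings will vary with the Ruby version and how it was compiled.

```ruby
require 'benchmark'
require 'digest'
require 'openssl'

# Hypothetical helper: SHA256 via Ruby's bundled digest library.
def digest_sha256(data)
  Digest::SHA256.hexdigest(data)
end

# Hypothetical helper: the same hash computed through the openssl library.
def openssl_sha256(data)
  OpenSSL::Digest.new('SHA256').hexdigest(data)
end

# Returns the wall-clock seconds each implementation needs for the same input.
def compare_sha256(data)
  {
    digest: Benchmark.realtime { digest_sha256(data) },
    openssl: Benchmark.realtime { openssl_sha256(data) }
  }
end

if __FILE__ == $PROGRAM_NAME
  # Assumed payload size; a real droplet may be much larger.
  payload = Random.new(42).bytes(16 * 1024 * 1024)
  compare_sha256(payload).each do |name, seconds|
    puts format('%-8s %.3fs', name, seconds)
  end
end
```

Both helpers return the same hex string for the same input, so only the elapsed time should differ.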

Steps to Reproduce

To easily reproduce the issue in a Cloud Foundry deployment:

  • Have 1 api instance (AWS t3.medium / 2 vCPUs, 4G memory)
  • Deploy (cf push) 8 applications in parallel, for example a Go sample app
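
The parallel push can be scripted, for instance as sketched below. The app name prefix, the source directory, and both helper names are hypothetical; the sketch assumes the cf CLI is installed and already logged in and targeted at an org and space.

```ruby
# Builds one `cf push` command per app. The "sample-app" prefix and
# the app directory are placeholders, not values from this deployment.
def push_commands(count, app_dir, prefix = 'sample-app')
  (1..count).map { |i| ['cf', 'push', "#{prefix}-#{i}", '-p', app_dir] }
end

# Starts all commands at once, waits for each of them,
# and returns true/false per command depending on its exit status.
def run_in_parallel(commands)
  pids = commands.map { |cmd| Process.spawn(*cmd) }
  pids.map { |pid| Process.wait2(pid).last.success? }
end
```

Calling `run_in_parallel(push_commands(8, './go-sample-app'))` would then start the 8 pushes concurrently, where `./go-sample-app` is again an assumed path.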

Expected result

  • Deployment times comparable to those on CAPI < 1.132.0
  • No failures while deploying 8 apps in parallel
  • No CF API unavailability

Current result

  • During droplet processing, CPU on the api instance spikes to 100% for up to 1 minute
  • perf-top shows CPU is mainly consumed by (see attached screenshot):
    • cc-worker: sha2.so / rb_Digest_SHA256_Transform
    • cloud-controller: md5.so / md5_process
    • cc-worker: libcrypto.so.1.1 / SHA1_Init
  • Requests to the CF API become slower and eventually return 404 Not Found: Requested route ("api.xxx.xyz") does not exist.

Possible Fix

We're currently mitigating the issue in part by adding more resources, but deployment times are still higher than before. I suspect this is because calculating a SHA hash is a single-threaded operation, so it is limited by the performance of a single CPU core.

It looks like the root cause can be addressed by implementing the workaround proposed in ruby/digest#35: switching from the digest library to openssl.
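
As an illustration of what such a switch could look like for file checksums, here is a minimal sketch. The helper name, the default algorithm, and the chunk size are assumptions; this is not the actual Cloud Controller code or the change made in the linked fix.

```ruby
require 'openssl'

# Hypothetical helper: streams a file through OpenSSL in fixed-size chunks,
# so a large droplet never has to be loaded into memory as a whole.
def openssl_file_hexdigest(path, algorithm = 'SHA256', chunk_size = 1024 * 1024)
  digest = OpenSSL::Digest.new(algorithm)
  File.open(path, 'rb') do |file|
    # IO#read with a length returns nil at end of file, which ends the loop.
    while (chunk = file.read(chunk_size))
      digest.update(chunk)
    end
  end
  digest.hexdigest
end
```

For the same file this should return the same hex string as `Digest::SHA256.file(path).hexdigest`, so stored checksums stay comparable after switching.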

perf-top Screenshot

[Screenshot: perf-top output for all processes on the CF API instance]
