Fix COMPOSE_PARALLEL_LIMIT #8226

Closed
pagelypete opened this issue Mar 22, 2021 · 18 comments

@pagelypete

pagelypete commented Mar 22, 2021

Description of the issue

There have already been multiple issues opened about this bug:
#7486
#5864

A PR that appears to fix the issue was already posted with reference to #5864. Could it be checked, tested, and merged so this bug can finally be fixed in master?

The bug is fairly critical: it completely deadlocks compose and, in the worst case, also seems to make interactions with the Docker daemon hang because of the number of open connections it holds.

The bug is worse when your compose file is dynamically generated and can contain a varying number of services. You then have to set a global, system-wide environment variable to an arbitrarily high number, which completely defeats the point of the variable in the first place (reducing CPU usage for container operations).

Anecdotally, I have tested that PR on 1.28.5, and up/down/restart/etc. operations all seem to work. With a compose file containing 150 services, this deadlocks on master:

COMPOSE_PARALLEL_LIMIT=2 docker-compose up -d --remove-orphans

With the linked PR it does not, and it behaves as you would expect (essentially bringing up the containers sequentially).
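To make the expected behaviour concrete, here is a minimal sketch of what a working parallel limit should look like: 150 operations complete, and never more than 2 run at once. It uses Python's standard `ThreadPoolExecutor`; it is not how compose implements the limit, and `run_with_limit` and `fake_start` are hypothetical names used only for illustration.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def run_with_limit(names, limit, action):
    """Run action(name) for every name, with at most `limit` running at once.

    Returns the results in input order plus the peak concurrency observed.
    """
    lock = threading.Lock()
    state = {"active": 0, "peak": 0}

    def wrapped(name):
        # Track how many actions are in flight so the limit can be checked.
        with lock:
            state["active"] += 1
            state["peak"] = max(state["peak"], state["active"])
        try:
            return action(name)
        finally:
            with lock:
                state["active"] -= 1

    # The pool size is the limit: queued work waits instead of deadlocking.
    with ThreadPoolExecutor(max_workers=limit) as pool:
        results = list(pool.map(wrapped, names))
    return results, state["peak"]


def fake_start(name):
    # Stand-in for a container start; a real call would talk to the Docker API.
    time.sleep(0.001)
    return name + " started"


if __name__ == "__main__":
    services = ["svc%03d" % i for i in range(150)]
    results, peak = run_with_limit(services, 2, fake_start)
    print(len(results), "services started, peak concurrency", peak)
```

With a limit of 2 the run is slow but bounded, which is the trade-off the variable is meant to offer.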

Context information (for bug reports)

Output of docker-compose version

docker-compose version 1.28.5, build unknown
docker-py version: 4.4.4
CPython version: 3.8.6
OpenSSL version: OpenSSL 1.1.1f  31 Mar 2020

Steps to reproduce the issue

  1. Create a docker-compose.yml file with a lot of services (ideally 100+)
  2. Set COMPOSE_PARALLEL_LIMIT to 2 and run an up operation
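For step 1, one way to get a large file is to generate it. The following is a minimal sketch, not part of the original report: the `busybox` image, the `sleep` command, and the `svcNNN` service names are all assumptions chosen to keep each service trivial.

```python
def render_compose(service_count):
    """Return the text of a compose file with `service_count` trivial services."""
    lines = ['version: "3"', "services:"]
    for i in range(service_count):
        # Hypothetical service: a small image that just sleeps.
        lines.append("  svc%03d:" % i)
        lines.append("    image: busybox")
        lines.append('    command: ["sleep", "3600"]')
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Print a small sample; use a larger count and redirect the output
    # into docker-compose.yml to build a reproduction file.
    print(render_compose(3), end="")
```

Calling `render_compose(150)` and writing the result to docker-compose.yml gives a file of the size mentioned in the description.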

Observed result

Compose will deadlock.

Expected result

Compose should bring up containers using the parallel limit.

Stacktrace / full error message

N/A

Additional information

Not relevant

@devZer0

devZer0 commented Apr 1, 2021

Hello, we also have massive problems with docker-compose. It does not reliably start containers on a host. We use docker-compose as part of a CI/CD environment, and Jenkins jobs fail multiple times a day because docker-compose does not reliably return "success" when bringing a bunch of containers online on a remote host via SSH.

@devZer0

devZer0 commented Sep 8, 2021

Now, months later, this is still open with no further comments. Is docker-compose orphaned/abandoned software that nobody cares for?

@ndeloof
Contributor

ndeloof commented Sep 8, 2021

Have you tried Docker Compose v2? With a new golang codebase, I guess concurrency is better addressed.

@pagelypete
Author

pagelypete commented Sep 8, 2021

> Have you tried Docker Compose v2? With a new golang codebase, I guess concurrency is better addressed.

I personally have not tried v2; however, the issue here was the load caused by running so many Docker operations at once, not the performance of compose itself. Is there a way to limit parallel Docker operations in v2 similar to COMPOSE_PARALLEL_LIMIT (except working properly)?

Either way, it would be really great to get this change merged. We have been using it in production for a long time now and have to build our own docker-compose just to use it, and it really does seem to fix the parallel problem and make COMPOSE_PARALLEL_LIMIT work properly. Servers that would otherwise be hugely overloaded can instead work normally.

@pagelypete
Author

@ndeloof I gave v2 a try and I don't see any way to limit parallelism, and while I can see that performance of compose itself has been hugely improved, I'm concerned about not seeing a way to limit what it sends to the Docker API in v2 at all (I grepped the source a little).

For example, if you have a compose file with 200 containers that otherwise runs fine on a system, running docker-compose up results in an absolutely massive load spike, whereas with a parallel limit in place it can be done more slowly but with no system load issues at all.

@devZer0

devZer0 commented Sep 9, 2021

We also cannot use v2, as there is no mechanism to limit parallelism. Our servers get totally overloaded if too many services are started in parallel, i.e. services fail to start correctly under too much load. So please either fix v1 or add COMPOSE_PARALLEL_LIMIT to v2.

@devZer0

devZer0 commented Sep 14, 2021

We are also using the patch in production now, with good results so far.

@devZer0

devZer0 commented Sep 24, 2021

I'm very sad that nobody cares and this is being ignored.

@ondravondra

We also ran into the issue of a massive spike in memory and CPU usage when building. A way to limit parallelism is a must-have for us.

@WolfspiritM

We also cannot use compose v2, as it just freezes our build server. We have more than 20 docker images in a docker compose file, and building them all at once just breaks everything. I noticed that buildkit has an option "max-parallelism" which should prevent that, but in our case "docker buildx bake" fails for our docker-compose file with "mapping values are not allowed in this context".

As far as I understand from the code, docker compose v2 is "hardcoded" to use the default buildkit docker driver. I can't see any way to set max-parallelism for that instance. Would it be possible to pass a buildx builder to docker-compose to be used instead of the default one? That way we could at least create a builder with "docker buildx create --name bla --buildkitd-flags '--oci-worker-max-parallelism=1'" and pass that to docker compose, maybe with something like "docker compose build --builder bla". Or maybe docker compose could just use the builder that buildx is set to use.

The buildx issue related to that is docker/buildx#359

@pagelypete
Author

> We also cannot use compose v2, as it just freezes our build server. We have more than 20 docker images in a docker compose file, and building them all at once just breaks everything. I noticed that buildkit has an option "max-parallelism" which should prevent that, but in our case "docker buildx bake" fails for our docker-compose file with "mapping values are not allowed in this context".
>
> As far as I understand from the code, docker compose v2 is "hardcoded" to use the default buildkit docker driver. I can't see any way to set max-parallelism for that instance. Would it be possible to pass a buildx builder to docker-compose to be used instead of the default one? That way we could at least create a builder with "docker buildx create --name bla --buildkitd-flags '--oci-worker-max-parallelism=1'" and pass that to docker compose, maybe with something like "docker compose build --builder bla". Or maybe docker compose could just use the builder that buildx is set to use.
>
> The buildx issue related to that is docker/buildx#359

It sounds like this would go some way to fixing the issue with regard to building, but not when bringing containers up (which was actually the original use case that triggered the issue for us).

I think the biggest problem is getting compose devs to actually see this issue - there are so many open issues that I guess many get lost, and I suspect that since this one has been labelled as being a compose v1 issue, it may not get looked at any more. Because of this, I have opened this issue to request the feature in compose v2 - #8849

@johanneswuerbach

With a complex compose stack I'm seeing frequent:

error getting credentials - err: fork/exec REMOVED/docker-credential-ecr-login: too many open files, out: ``

when running docker compose pull.

So an option to limit pull parallelism would be great.

@stale

stale bot commented Sep 21, 2022

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the stale label Sep 21, 2022
@jeffrson

push

@stale

stale bot commented Nov 2, 2022

This issue has been automatically closed because it had not recent activity during the stale period.

@stale stale bot closed this as completed Nov 2, 2022
@devZer0

devZer0 commented Nov 2, 2022

stale bot sucks

not fixing bugs sucks , too.

@pagelypete
Author

Yep, quite annoying since @jeffrson bumped it right after the label was added...

@milas
Contributor

milas commented Jan 30, 2023

The issue for tracking the lack of support in Compose v2 for COMPOSE_PARALLEL_LIMIT is at #9091 and support was added in v2.15.

I'm going to lock this now to prevent further confusion because this issue was tracking a potential deadlock in Compose v1 when using COMPOSE_PARALLEL_LIMIT.

If you have issues with Compose v2.15+ and COMPOSE_PARALLEL_LIMIT, please create a new bug report.
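For anyone scripting this against Compose v2.15+, a minimal sketch of passing the limit for a single invocation rather than setting it system-wide might look like the following. It assumes the `docker compose` plugin is on the PATH; the helper names `compose_up_invocation` and `compose_up` are hypothetical, and the flags simply mirror the command from the original report.

```python
import os
import subprocess


def compose_up_invocation(parallel_limit, base_env=None):
    """Build the argv and environment for an `up` run with a parallel limit."""
    # Copy the environment so the limit applies to this invocation only.
    env = dict(os.environ if base_env is None else base_env)
    env["COMPOSE_PARALLEL_LIMIT"] = str(parallel_limit)
    argv = ["docker", "compose", "up", "-d", "--remove-orphans"]
    return argv, env


def compose_up(parallel_limit):
    """Run the command; raises CalledProcessError if compose exits non-zero."""
    argv, env = compose_up_invocation(parallel_limit)
    return subprocess.run(argv, env=env, check=True)
```

Keeping the variable scoped to one process avoids the global, arbitrarily high setting described at the top of this issue.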

@docker docker locked as off-topic and limited conversation to collaborators Jan 30, 2023