Make queue size configurable by workers #52

1597463007 · 2025-01-29T17:53:09Z

Currently, the queue size is configured by the scheduler and applied globally across all connected workers. This was not considered a problem until the introduction of the IBM Spectrum Symphony worker, which brought in workers with alternative implementations. Workers running on higher-spec hosts, or workers that can execute multiple tasks concurrently, should be allowed to accept more tasks into their queues.

This issue will track the progress of moving the queue size configuration into the worker side.

Each worker will send its queue size in the heartbeat payload. The existing --per-worker-queue-size scheduler flag will become a no-op and will be removed in the future.
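To make the intended flow concrete, here is a rough sketch of how a scheduler might honor a queue size advertised in each heartbeat. The `Heartbeat` and `Scheduler` classes and all of their fields are hypothetical stand-ins for illustration, not Scaler's actual message types or scheduler code.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Heartbeat:
    """Hypothetical heartbeat payload; the real message carries other fields."""
    worker_id: str
    queue_size: int  # capacity advertised by the worker itself


@dataclass
class Scheduler:
    """Toy scheduler that tracks each worker's advertised capacity."""
    capacity: Dict[str, int] = field(default_factory=dict)
    queued: Dict[str, List[str]] = field(default_factory=dict)

    def on_heartbeat(self, hb: Heartbeat) -> None:
        # The latest heartbeat wins, so a worker can change its limit at runtime.
        self.capacity[hb.worker_id] = hb.queue_size
        self.queued.setdefault(hb.worker_id, [])

    def try_assign(self, worker_id: str, task_id: str) -> bool:
        # Reject unknown workers and workers whose own limit is reached.
        tasks = self.queued.get(worker_id)
        if tasks is None or len(tasks) >= self.capacity[worker_id]:
            return False
        tasks.append(task_id)
        return True


if __name__ == "__main__":
    scheduler = Scheduler()
    scheduler.on_heartbeat(Heartbeat("plain-worker", queue_size=1))
    scheduler.on_heartbeat(Heartbeat("symphony-worker", queue_size=3))
    for task in ("t1", "t2"):
        print(task, "->", scheduler.try_assign("plain-worker", task))
```

With this shape, a worker that runs tasks concurrently simply advertises a larger `queue_size`, and the scheduler needs no global setting.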

gxuu · 2025-01-31T21:24:58Z

I've rolled a basic solution according to the requirements outlined in issue #52. I welcome any comments and advice to further improve the program.

This pull request doesn't fully comply with the guidelines. Here's a list of the missing items:

I didn't increment the version number. While this change breaks compatibility, it's not a significant feature. It might be best to address this when rolling out major updates.
I didn't write tests. The implementation passed all existing tests, and I've also manually tested the code. I'm willing and eager to write tests after receiving feedback.
I didn't polish the code, and the naming is poor.

I'm also confused about the following:

It seems "servers" are organized into "Clusters," and each cluster consists of several workers. Since all workers within a cluster run on the same machine, why not specify the queue size at the cluster level?

Thanks,
gxu

1597463007 · 2025-01-31T22:23:26Z

I agree the terminology can be confusing. A Dask worker is analogous to a Scaler cluster, as each Dask worker can hold more than one process to execute tasks.

In Scaler's context, "Cluster" means a group of workers running under the same parent PID. It's mainly used to make cleaning up the workers easier.

The term has diverged a bit from its original meaning as more worker implementations have been created. For example, the IBM Spectrum Symphony Worker is for all intents and purposes a "Worker", but it behaves more like a "Cluster" because it communicates with the Symphony Grid Scheduler (and by extension the Symphony Grid Workers) and can run tasks concurrently.

The cluster level has no ability to handle messages; messages are sent directly between the Scheduler and the Workers. That is why the queue size is specified per worker rather than per cluster.

1597463007 changed the title ~~Make queue size configurable by the worker~~ Make queue size configurable by workers Jan 29, 2025

gxuu mentioned this issue Jan 31, 2025

Basic Implementation Making Queue Size Configurable by Workers #53

Open
