
[Core][docs] document nested task resource yielding behavior #2618

Closed
mitar opened this issue Aug 9, 2018 · 15 comments
Labels
core Issues that should be addressed in Ray Core · docs An issue or change related to documentation · P1 Issue that should be fixed within a few weeks · Ray-2.4 · size-small

Comments

@mitar (Member) commented Aug 9, 2018

It seems this is currently not possible? It would be great if I could spawn multiple tasks and assign a priority to each of them, something like fun.remote(priority=0.4). In that case, if I spawn more tasks than the cluster can support, those with higher priority would start first. For our needs, priority could be just a numeric value, and it could be only a hint that is not strictly required to be respected.

(And then, after the first tasks return, I could kill the rest if I see that I do not need them anymore.)
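
To make the proposal concrete, here is a minimal sketch of the API shape being asked for. The priority option is hypothetical and does not exist in Ray; ray.wait and ray.cancel are existing APIs.

```python
import ray

ray.init()

@ray.remote
def fun(x):
    return x

# Hypothetical: a best-effort `priority` hint. This option does NOT exist
# in Ray today; higher values would be started first when more tasks are
# queued than the cluster can run at once.
refs = [fun.options(priority=p).remote(p) for p in (0.9, 0.4, 0.7)]

# Once the first results are back, the remaining tasks could be cancelled.
ready, rest = ray.wait(refs, num_returns=1)
for r in rest:
    ray.cancel(r)
```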

@atumanov (Contributor) commented Aug 9, 2018

@mitar, it would be great to understand the exact use case for priority. Scheduling priority has been used (incorrectly) as an easy-to-implement mechanism to control the order in which work is done. In many cases, however, priority was used as a proxy for something else (e.g., more urgent tasks with tighter latency constraints). Our scheduler makes an effort to place (and dispatch) tasks in the order they were received, so tasks already come with the implicit ordering priority I think you are looking for.

Supporting priority explicitly can only be done best-effort, as strict ordering is hard to achieve in a distributed setting. Furthermore, to avoid head-of-line blocking, we may skip over tasks for which there are insufficient resources and schedule ones that can be accommodated immediately.

Priority has been improperly used as a scheduling mechanism to refer to (a) urgency and (b) importance, both of which can and should be implemented explicitly by adding to the scheduler (a) latency SLO (deadline) awareness (e.g., InferLine) and (b) task/job utility (e.g., TetriSched). So we should focus on the problem you are trying to solve and decide on the best mechanism to solve it.

@mitar (Member, Author) commented Aug 9, 2018

Yes, as I described above, I understand that it can only be done best-effort, and that would be good enough for me. I am not seeking an optimal solution here, just some way of prioritization.

My use case is that we are building an AutoML framework on top of Ray. We search for pipelines to solve a problem, and we have code that tries to estimate the quality of each pipeline. Assume it returns a value between 0 and 1, where higher is better. Once we assess the quality, we want to take the good ones (those over some threshold) and submit tasks for a more realistic evaluation, for example fully training them with cross-validation. So what I would like to do is submit evaluation tasks for good candidate pipelines as we find them, but prioritize those tasks based on the assessed quality.

Because this is an ongoing process (the search for pipelines happens at the same time as we submit them for evaluation), I cannot submit them in order of quality. So I would like to submit them and give Ray some information on how to prioritize them.

I think this could be generally useful for any kind of search on top of Ray. If there are other approaches for this, I would also be curious.
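
Absent scheduler support, one way to approximate this from the driver side is to keep only a bounded number of evaluation tasks in flight and always launch the highest-scoring pending candidate next. A rough sketch (evaluate, submit, pump, and MAX_IN_FLIGHT are illustrative names, not Ray APIs):

```python
import heapq
import itertools
import ray

ray.init()

@ray.remote
def evaluate(pipeline):
    # stand-in for the expensive evaluation, e.g. full cross-validation
    return pipeline

MAX_IN_FLIGHT = 8             # roughly the cluster's parallel capacity
_tiebreak = itertools.count()
pending = []                  # max-heap of (-quality, counter, pipeline)
in_flight = []

def submit(pipeline, quality):
    # called as the search produces new candidate pipelines
    heapq.heappush(pending, (-quality, next(_tiebreak), pipeline))

def pump():
    # launch the best pending candidates, then collect finished results
    global in_flight
    while pending and len(in_flight) < MAX_IN_FLIGHT:
        _, _, pipeline = heapq.heappop(pending)
        in_flight.append(evaluate.remote(pipeline))
    if not in_flight:
        return []
    done, in_flight = ray.wait(in_flight, num_returns=len(in_flight), timeout=0)
    return ray.get(done)
```

The prioritization is only as fresh as the last call to pump(), but it needs nothing from the scheduler itself.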

@suquark (Member) commented Aug 11, 2018

@mitar What do you want to do with the task with the lowest priority? Do you want to kill it, or just leave it pending until there are enough resources again?

@mitar (Member, Author) commented Aug 11, 2018

We will see about that. I would not do anything automatically; it should stay pending, but the caller might decide to kill it as well, once such functionality exists in Ray. But yes, the default would be to stay pending until there are enough resources.

@drewm1980

I can share another use case. I'm evaluating Ray for a vision/robotics application. In my use case, I want to efficiently transfer fresh sets of camera images, point clouds, etc. between stages in a data processing pipeline, including 1-2 learned stages running on the GPU and 3-4 stages (written in C++) running on the CPU. There's a 300 ms soft real-time deadline for this pipeline. I would like to log all of the raw data and many intermediate results to disk in parallel without disturbing the real-time data processing pipeline. For my image set, compressing and dumping on the main thread takes about 200 ms; pickling and transferring to another process takes about 100 ms. Using Plasma and Arrow, I hope I can get the data off the main thread in a trivial amount of time (10 ms?) and into another non-real-time (in the Linux scheduler sense) thread or process. If Ray supported prioritization (or an even more direct representation of the deadlines, as @atumanov suggests), I could consider using it for setting up the whole pipeline, including data recording.

@atumanov (Contributor)

@drewm1980 , thanks for sharing this use case!

@mitar (Member, Author) commented Mar 15, 2019

In the meantime, we have a new use case. We use Ray Tune to run experiments on the cluster, but while experiments run, a user might want to interact with them through our dashboard, for example to save some intermediate results. We are currently using a Ray task for this saving process, but ideally it should have a higher priority than any Ray Tune task.
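
One workaround that exists today (not a priority mechanism, but it covers this particular case) is to carve out capacity with a custom resource so the interactive save task never queues behind Tune trials. The resource name interactive_slot below is made up for illustration:

```python
import ray

# Reserve one logical slot for interactive work when starting Ray.
ray.init(num_cpus=8, resources={"interactive_slot": 1})

@ray.remote(num_cpus=0, resources={"interactive_slot": 1})
def save_intermediate_results(experiment_id):
    # Tune trials request only CPU/GPU, so this custom resource is never
    # consumed by them and the save task can be scheduled right away.
    ...

ref = save_intermediate_results.remote("exp-42")
```

Note that the save task still shares physical CPU time with the trials; the custom resource only controls scheduling admission.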

@virtualluke (Contributor)

Having task priorities could also help solve issues like #3644. Giving more deeply nested tasks a higher priority would keep a deadlock from happening, I would think.

@ericl added the enhancement label and removed the feature request label on Mar 5, 2020
@simon-mo added the P1 label on Mar 19, 2020
@ericl added the P3 label and removed the P1 label on May 18, 2020
@wmayner commented Jan 11, 2022

Seconding @virtualluke's point. This seems critical for any nested task structure.

@pongnguy commented May 5, 2022

Although workflows are still in alpha, this would be important for them, since earlier tasks with many dependent tasks should have higher priority.

@zhenfeng-cao

Hi @atumanov, has this issue been solved in the latest version of workflows?

Intuitively, steps nested inside another step should have higher priority, so the priority could be inferred automatically from this relationship, given that users are using workflows to organize their tasks.

(I have no idea how Ray's workflows control task priority internally; is it a priority queue?)

Could you explain this to me?

Thanks!

@wmayner commented Dec 7, 2022

Hi @ericl and @atumanov, do you have any recommendations for the best approach to dealing with potential deadlock/starvation issues with nested tasks? Thanks!

@ericl (Contributor) commented Dec 7, 2022

Ray already yields CPU resources when blocked on nested tasks, and implements fair sharing to avoid starvation of deeper tasks. Do you have a concrete example where Ray doesn't successfully execute a nested task graph?
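
For reference, a minimal runnable sketch of the behavior described above: a parent task that blocks in ray.get() releases its CPU while waiting, so nested tasks can run even when there are fewer CPUs than pending tasks.

```python
import ray

ray.init(num_cpus=2)

@ray.remote(num_cpus=1)
def child(i):
    return i * 2

@ray.remote(num_cpus=1)
def parent():
    # While blocked in ray.get(), this task releases its CPU so the child
    # tasks can be scheduled; the CPU is re-acquired before it resumes.
    return sum(ray.get([child.remote(i) for i in range(4)]))

print(ray.get(parent.remote()))  # 12
```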

@wmayner commented Dec 9, 2022

Oh, I didn't realize the scheduler already handles this! That's awesome. Thanks!

@scv119 added the docs, P1, Ray 2.3, and core labels and removed the P3 label on Dec 9, 2022
@scv119 changed the title from "Task priorities" to "[Core][docs] document nested task resource yielding behavior" on Dec 9, 2022
@scv119 (Contributor) commented Dec 9, 2022

By the way, the nested task resource yielding behavior is documented here: https://docs.ray.io/en/latest/ray-core/tasks/nested-tasks.html#yielding-resources

@rkooo567 added the Ray-2.4 label and removed the Ray 2.3 label on Feb 20, 2023
@rkooo567 added the size-small, P0, and P1 labels and removed the enhancement, P1, and P0 labels on Feb 20, 2023
@jjyao closed this as completed on Mar 28, 2023