-
Notifications
You must be signed in to change notification settings - Fork 6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix raylet bug in driver cleanup #3962
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -304,28 +304,35 @@ void TaskDependencyManager::TaskCanceled(const TaskID &task_id) { | |
|
||
void TaskDependencyManager::RemoveTasksAndRelatedObjects( | ||
const std::unordered_set<TaskID> &task_ids) { | ||
if (task_ids.empty()) { | ||
return; | ||
} | ||
|
||
// Collect a list of all the unique objects that these tasks were subscribed | ||
// to. | ||
std::unordered_set<ObjectID> required_objects; | ||
for (auto it = task_ids.begin(); it != task_ids.end(); it++) { | ||
auto task_it = task_dependencies_.find(*it); | ||
if (task_it != task_dependencies_.end()) { | ||
// Add the objects that this task was subscribed to. | ||
required_objects.insert(task_it->second.object_dependencies.begin(), | ||
task_it->second.object_dependencies.end()); | ||
} | ||
// The task no longer depends on anything. | ||
task_dependencies_.erase(*it); | ||
required_tasks_.erase(*it); | ||
// The task is no longer pending execution. | ||
pending_tasks_.erase(*it); | ||
} | ||
|
||
// TODO: the size of required_objects_ could be large, consider to add | ||
// an index if this turns out to be a perf problem. | ||
for (auto it = required_objects_.begin(); it != required_objects_.end();) { | ||
const auto object_id = *it; | ||
// Cancel all of the objects that were required by the removed tasks. | ||
for (const auto &object_id : required_objects) { | ||
TaskID creating_task_id = ComputeTaskId(object_id); | ||
if (task_ids.find(creating_task_id) != task_ids.end()) { | ||
object_manager_.CancelPull(object_id); | ||
reconstruction_policy_.Cancel(object_id); | ||
it = required_objects_.erase(it); | ||
} else { | ||
it++; | ||
} | ||
required_tasks_.erase(creating_task_id); | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Should we be erasing E.g., if there are some other tasks that still require I may just be confusing myself. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I guess the best way would be to erase from There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Could you add a comment explaining that (or a check checking that that is the case)? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Ah sorry I missed this. But I did add a check to the end of the function that checks whether all of the tasks were erased from |
||
HandleRemoteDependencyCanceled(object_id); | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This line is a no-op given the above check, right? Since the implementation is void TaskDependencyManager::HandleRemoteDependencyCanceled(const ObjectID &object_id) {
bool required = CheckObjectRequired(object_id);
// If the object is no longer required, then cancel the object.
if (!required) {
auto it = required_objects_.find(object_id);
if (it != required_objects_.end()) {
object_manager_.CancelPull(object_id);
reconstruction_policy_.Cancel(object_id);
required_objects_.erase(it);
}
}
} There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. No, it cancels the object with the |
||
} | ||
|
||
// Make sure that the tasks in task_ids no longer have tasks dependent on | ||
// them. | ||
for (const auto &task_id : task_ids) { | ||
RAY_CHECK(required_tasks_.find(task_id) == required_tasks_.end()) | ||
<< "RemoveTasksAndRelatedObjects was called on" << task_id | ||
<< ", but another task depends on it that was not included in the argument"; | ||
} | ||
} | ||
|
||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we not do this anymore?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It happens inside
HandleRemoteDependencyCanceled
.