-
Notifications
You must be signed in to change notification settings - Fork 56
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix broken shutdown if scheduling from event loop context #550
Fix broken shutdown if scheduling from event loop context #550
Conversation
@@ -334,6 +334,7 @@ final class Scheduler : IScheduler | |||
auto caller_task = Task.getThis(); | |||
if (caller_task !is null) | |||
caller_task.kill(); | |||
return; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unrelated to this PR, but it might be worth adding a comment as to why the calling task is killed if schedule
is called during shutdown.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will add
src/ocean/task/Scheduler.d
Outdated
@@ -615,6 +615,8 @@ final class Scheduler : IScheduler | |||
public void processEvents ( ) | |||
{ | |||
auto task = Task.getThis(); | |||
assert(task !is null); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not verify
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It will crash anyway even if check is missing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mm I don't get what you mean. Aren't these the possible cases in D2 builds?
assert
,task is null
: crash.assert
,task !is null
: keeps running.verify
,task is null
: throws.verify
,task !is null
: keeps running.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Few more options depending on -release
but essentially yes. But don't forget that it crashes right now if condition is false because there is no check at all. So this exactly matches assert purpose - turn crash / corruption into an easier debuggable crash.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But didn't we go through and convert all such cases into verify
s?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If that was the case, we have wasted a lot of effort on something fundamentally flawed. You can't avoid crashes no matter how hard you try and must embrace them :) From my PoV goal of verify
is to protect against memory corruptions and similar damage.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To word it differently - the fact that right now dhtnode can hit an assert
(now verify
) condition and survive after is a bug in dhtnode that I totally expect to be fixed some day, not a feature to support.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the fact that right now dhtnode can hit an assert (now verify) condition and survive after is a bug in dhtnode that I totally expect to be fixed some day
This is the key. Using assert
s to notify us of such bugs causes crashes. Using verify
s to notify us of such bugs (potentially) avoids crashes and allows us to log the bug. From the POV of a dhtnode, the latter is way more desirable than the former.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I disagree. Consider this PR - before my change it would always just segfault on wrong usage, in both D1 and D2. There are no asserts, just plain segfault. The very fact that it was working like that in deployed live apps for a very long time is a direct proof that this is perfectly fine.
This is reality that can't be worked around - no matter what policy you take on assert
, your app will crash eventually and bugs will have to be fixed. Effort to make it as harmless as possible is meaningful, effort to prevent it altogether is not.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The very fact that it was working like that in deployed live apps for a very long time is a direct proof that this is perfectly fine.
It's only fine because it never came up. If it had have come up, an assert
here would be as bad as a plain segfault, in terms of direct impact to the application.
Effort to make it as harmless as possible is meaningful, effort to prevent it altogether is not.
This is exactly what I'm suggesting. Throwing is generally less harmful than crashing. Both have equal value in terms of notifying maintainers of bugs.
Updated with a comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Ping. Can we please label this PR as high priority? I need this patch to make a daemon application works properly. |
I will make a release as soon as @gavin-norman-sociomantic gives a green light :) |
Because of missing return statement, trying to schedule a new task after shutdown would succeed but result in application infinitely stuck during shutdown sequence. With this fix, any such tasks will be ignored allowing for clean exit. There is a concern of losing app data but it will have to be addressed in a separate changest, targetting minor or maybe even major branch.
Green light. |
With this fix, any such tasks will be ignored allowing for clean exit.
There is a concern of losing app data but it will have to be addressed
in a separate changest, targetting minor or maybe even major branch.