-
Notifications
You must be signed in to change notification settings - Fork 3.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Don't Panic on Missing Dirs #1874
Conversation
clusterID := b.Broker.Log().Config().ClusterID | ||
go s.StartReportingLoop(version, clusterID) | ||
// Make sure we have a config object b4 we try to use it. | ||
if configObj := b.Broker.Log().Config(); configObj != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How could this happen? The nil
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, it's what was causing the panics initially seen.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now, I think I've defended this from ever actually happening.
Couple of things -- I think the design of checking if directories exist to determine the role of a node (server, data, or both) is broken. I think we should have some explicit state somewhere storing this (and derived from the "role" if it doesn't exist). Then check if the required directories exist, according to the state. I believe @toddboom agrees with me. Also, this change seems to be only valid if a node is both a broker and a server. This is not necessarily the case. |
Oh sure, I think just checking for directory existence isn't a very strong check of viability. Nevertheless, this is what the previous code did. This change doesn't address node-only or broker-only configurations. The run function just tries to open a broker and to start a server. I haven't changed any of that logic, I don't think. |
OK, @rothrock and I spoke about this. I wanted to be clear that there is no issue unless the directory tree is in an inconsistent state. This could only happen (that I know of) if a Raft directory was deleted, and not the data directory, or vice-versa. Re-creating the directories should fix the issue, but I'm not sure it's what the user wanted since they may have just attempted to blow everything away. Not sure. I agree this code definitely needs work though -- it's got rather complex. It will be overhauled when "role" support comes back in. |
Perhaps we could add logging to the output when the directories are recreated? That way at least we warn the user that we have found the tree in an inconsistent state, and attempting to repair. |
Sure, I can say "...not found, creating..." |
Issue: 1753 Figure out what's missing, create it, and keep going.
OK, thanks. +1 from me. |
Conflicts: CHANGELOG.md
Issue: 1753
Figure out what's missing, create it, and keep going.