Options for Mitigating Input Downtime #68
I think it's a great idea to make this behavior configurable. I think most people would rather drop some logs than experience downtime! :) For this to work really well, LogStashLogger would need to never block and never raise an exception. This would require a thorough review of the code to make sure it behaves consistently everywhere. I can see there being different options for this (the second is sketched below):

1. Don't buffer messages; drop messages on connection failure.
2. Buffer messages, but drop them if the buffer gets full.
3. Buffer messages, but don't drop them (i.e. block until the connection is re-established).
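As a rough illustration of the second option, here's a minimal sketch of a bounded buffer that never blocks the caller. The class and method names are hypothetical, not LogStashLogger internals:

```ruby
require 'thread'

# Sketch: buffer messages, but drop them when the buffer is full,
# so logging calls never block the application.
class DroppingBuffer
  def initialize(max_items = 50)
    @queue = SizedQueue.new(max_items)
  end

  # Returns false (and drops the message) instead of blocking when full.
  def push(message)
    @queue.push(message, true) # non-blocking push
    true
  rescue ThreadError
    false # buffer full; message dropped
  end
end
```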
I think we experienced the same thing recently: our Redis system was down, and this seemed to cause long pauses in our Rails application because of timeouts while sending logs to it. Is there a way to set a very short timeout for logging?
The Ruby Redis client defaults to a 5 second timeout. You can override it by passing a different value for `timeout`.
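As a sketch (assuming the Redis device forwards this option through to the underlying `Redis.new` call, which may not hold for every version of the gem):

```ruby
require 'logstash-logger'

# Hypothetical configuration: fail fast instead of blocking for the
# Redis client's default 5 seconds on a dead connection.
logger = LogStashLogger.new(
  type: :redis,
  host: 'localhost',
  timeout: 0.5 # seconds; assumed to be passed through to the Redis client
)
```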
So we're not going to work on this due to time and resource constraints, but I did want to report back with the solution we went with: we ended up removing this gem entirely. Instead, we're logging to files that are tailed by Filebeat, which ships events directly to our collectors. I regret that I won't be able to work on this. I'm going to leave this issue open for now, as it's still an issue that I believe should be solved at some point.
I agree that it should be solved. LogStashLogger is essentially an in-process log shipper, and should act in a well-defined, reliable way that does not interfere with normal operation of the application.
We were bitten by this in production today also. Our ELK stack went down over the weekend and eventually, I believe, the inability to log caused our Sidekiq workers to hang. I would +1 the option to simply lose data when the buffer is full rather than have the application become unresponsive.
Thanks much @sauliusgrigaitis for that fix!
Hello there, yesterday we experienced the same issue (our Redis endpoint went down and the application quickly became unresponsive). @dwbutler What needs to be done to introduce the ability to drop data if the connection times out? I'd be happy to help if you give me a few pointers on which code to review (as you suggested earlier in this very issue).
LogStashLogger currently uses `Stud::Buffer` to implement buffering for connectable devices (such as TCP, Redis, etc.). When the remote service goes down, an exception is raised when a buffer flush is attempted. By default, `Stud::Buffer` will retry sending the messages forever. Since a flush is triggered when a message is received, or on a regular timer, this causes logging calls to block. See jordansissel/ruby-stud#28.

`Stud::Buffer` allows callbacks to be fired when it encounters a flush error. This change ties into that mechanism to abort the flush and re-enqueue the failed messages. This behavior is now enabled by default. To instead drop messages when there is a flush error, pass the new `drop_messages_on_flush_error` option to the logger.

Most applications will want to buffer messages and only drop them when the buffer fills up. This behavior has been implemented by tying into `Stud::Buffer`'s callback for the buffer-full event. By default, when the buffer is full, `Stud::Buffer` will block when any new message comes in, until there is room in the buffer. If you want to discard messages in the buffer when this happens, pass the new `drop_messages_on_full_buffer` option to the logger.

Fixes #68
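To make the mechanism concrete, here is a sketch of a device that wires up both callbacks. This is illustrative only, not the gem's actual implementation; `send_to_remote` and the option handling are hypothetical:

```ruby
require 'stud/buffer'

class SketchDevice
  include Stud::Buffer

  def initialize(drop_messages_on_flush_error: false)
    @drop_messages_on_flush_error = drop_messages_on_flush_error
    buffer_initialize(max_items: 50, max_interval: 5)
  end

  def write(message)
    buffer_receive(message)
  end

  # Stud::Buffer calls this to deliver a batch of buffered messages.
  def flush(messages, group = nil)
    send_to_remote(messages) # hypothetical transport call
  end

  # Fired when a flush raises. Re-raising here breaks out of
  # Stud::Buffer's internal retry loop so the caller isn't blocked;
  # the failed batch can then be dropped or re-enqueued upstream.
  def on_flush_error(error)
    raise error unless @drop_messages_on_flush_error
  end

  # Notification hook fired while the buffer is at capacity; the
  # patch ties into this event to drop messages instead of blocking.
  def on_full_buffer_receive(counts)
    # counts is a hash like { pending: 50, outgoing: 0 }
  end
end
```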
I finally found some time to work on this. Please try the patch in #81. By default, the logger will no longer block when there is a connection error. If you want to drop messages when the buffer is full, add this new configuration option to your logger:

```ruby
logger = LogStashLogger.new(type: :redis, drop_messages_on_full_buffer: true)
```
Thanks @dwbutler for getting this done! Any chance you'll release a new version of the gem anytime soon?
Yes, my goal is to release sometime this week.
We recently experienced an issue where our Redis input for Logstash went down and our app became unresponsive, a scenario outlined in the README. As noted there, we can bump up the values for the buffer configuration, but it doesn't seem that this will prevent the issue from recurring in the event of another significant logging infrastructure outage.

There's the `sync` option, but based on the documentation, I'm unclear on whether it would have prevented this issue. It would be great if there were a way for the logger to flush the buffer if it receives a connection error. We'd much rather lose logs than take downtime. Is this something that would be possible? I'd be more than willing to work on a patch and submit a PR if you thought it was possible and worthwhile and could give a little direction.
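For reference, a sketch of the buffer-related options discussed above; the values shown are illustrative only (defaults per the README):

```ruby
logger = LogStashLogger.new(
  type: :redis,
  buffer_max_items: 50,   # flush once this many messages accumulate
  buffer_max_interval: 5, # ...or after this many seconds
  sync: false             # sync: true writes each message immediately, bypassing the buffer
)
```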