Memory usage while consuming #81

Closed

winbatch opened this issue Feb 18, 2014 · 20 comments
Comments

@winbatch

Have you ever seen a case where, when consuming, if the consumer can't keep up (let's say it is writing each message to a slow disk), the amount of memory explodes? I'm seeing that. I've run valgrind and don't see any leaks, so I'm thinking librdkafka is caching messages, but I haven't changed any of the defaults such as fetch.message.max.bytes.

@winbatch
Author

Also - I see there is a 'queued.min.messages' but no 'queued.max.messages'?

@winbatch
Author

In case it wasn't obvious, I can reproduce this with sleep() rather than with a slow disk.

@edenhill
Contributor

rdkafka will fetch more messages as long as there are fewer than queued.min.messages in the local queue, and for each fetch it will request up to fetch.message.max.bytes worth of messages, so there is no need for a queued.max.messages.
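
For illustration, a minimal sketch of setting those two properties via librdkafka's rd_kafka_conf_set() API; the values shown are just examples, not recommendations:

    #include <stdio.h>
    #include <librdkafka/rdkafka.h>

    int main(void) {
        char errstr[512];
        rd_kafka_conf_t *conf = rd_kafka_conf_new();

        /* Fetch only while the local queue holds fewer than this many messages. */
        if (rd_kafka_conf_set(conf, "queued.min.messages", "100000",
                              errstr, sizeof(errstr)) != RD_KAFKA_CONF_OK)
            fprintf(stderr, "%% %s\n", errstr);

        /* Upper bound on the payload requested by each fetch. */
        if (rd_kafka_conf_set(conf, "fetch.message.max.bytes", "1048576",
                              errstr, sizeof(errstr)) != RD_KAFKA_CONF_OK)
            fprintf(stderr, "%% %s\n", errstr);

        rd_kafka_conf_destroy(conf);
        return 0;
    }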

@winbatch
Author

Yes, but what if I want it to STOP fetching messages? My issue is that without a max, it will continue to fetch and use huge amounts of memory.


winbatch reopened this Feb 19, 2014
@edenhill
Contributor

It will stop when the queued.min.messages threshold is reached, and start fetching again whenever it drops below that threshold.

@winbatch
Author

OK - that worked. Although I wouldn't mind if you offered a config parameter saying that when the total bytes queued reach 'X', consuming should stop. That would let you control the amount of memory used when you don't know the size of the upcoming messages on the wire.

@edenhill
Contributor

The maximum memory consumption (per consumed toppar, i.e., topic+partition) should be close to queued.min.messages * message.max.bytes.
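
As a rough worked example, assuming the defaults of the time (queued.min.messages = 100000, message.max.bytes = 1000000):

    100000 msgs * ~1 MB/msg ≈ 100 GB worst case per toppar

so with max-sized messages the bound is enormous, while the same settings with ~1 KB messages bound the queue at roughly 100 MB.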

@edenhill
Contributor

Allowing some factor of overhead, does this match the memory consumption you're seeing in your application?

@winbatch
Author

That's hard for me to know since the message sizes coming in are random. That's why I'd like to be able to control the number of messages based on memory. This way, if incoming messages are small, I can ask for more of them; if messages are large, I don't run out of memory. I am not worried about getting a single huge message such that message.max.bytes really figures in. Let's say I only want to queue up 500 MB of memory: I'd like to be able to configure it such that librdkafka queues as many messages as possible without going over.

@edenhill
Contributor

Okay, a queued.min.bytes property.
So, if either of the queued.min.messages or queued.min.bytes thresholds is reached, it will pause fetching until both levels drop below their thresholds again. Okay with you?

@winbatch
Author

A queued.max.bytes property. It's a threshold I don't want to cross. I also find the queued.min.messages naming confusing; it feels like it should be named 'max' based on how you described the logic above. But the feature is more important to me than the name, so it's up to you ;)


@edenhill
Contributor

Myeah, it might seem a little odd.
queued.min.messages is the threshold that toggles whether rdkafka should fetch more messages or not; it does not really control a maximum number of queued messages, even though it has that effect as well, since fetching stops once the threshold is reached. But it will not queue exactly that number of messages; rather, up to queued.min.messages-1 plus the number of messages received in the last fetch batch (which depends on message.max.bytes).

If I added a queued.max.bytes property it would be a soft limit: it would still queue all messages received in a fetch reply, possibly overshooting the queued.max.bytes value by some amount (up to message.max.bytes). Otherwise rdkafka would have to drop received messages to satisfy the maximum queue limit, only to refetch them again shortly.

I'm rambling.
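
To make the toggle-plus-overshoot behaviour concrete, here is a hypothetical sketch of the decision just described; this is not librdkafka's actual code, and all the names are made up:

    #include <stdbool.h>
    #include <stddef.h>

    /* Fetching is (re)enabled only while both levels sit below their
     * thresholds. A fetch reply already in flight is queued in full,
     * so queued_bytes can overshoot max_bytes by up to one fetch
     * batch (each message bounded by message.max.bytes). */
    static bool want_fetch(size_t queued_msgs, size_t queued_bytes,
                           size_t min_msgs,   /* queued.min.messages */
                           size_t max_bytes)  /* proposed queued.max.bytes */
    {
        return queued_msgs < min_msgs && queued_bytes < max_bytes;
    }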

@winbatch
Author

Yeah, you're rambling. Bottom line: I want to safeguard against running out of memory without limiting myself to a certain number of messages. If you've got a cleaner way to do that, that's fine. I don't care if it gets overshot by a little bit.


@edenhill
Contributor

I'll add queued.max.messages.kbytes that does what is defined above (including the possible overshoot).

edenhill added a commit that referenced this issue Feb 22, 2014

Defaults to 1 gig.

This also adds "fetchq_size" (alongside "fetchq_cnt") to the stats output.
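
For later readers, a minimal sketch of capping the queue with the new property; the value is in kilobytes, so "512000" is roughly the 500 MB budget discussed above (overshoot by up to one fetch batch still applies):

    #include <stdio.h>
    #include <librdkafka/rdkafka.h>

    int main(void) {
        char errstr[512];
        rd_kafka_conf_t *conf = rd_kafka_conf_new();

        /* Soft cap on the local prefetch queue: ~500 MB, in kilobytes. */
        if (rd_kafka_conf_set(conf, "queued.max.messages.kbytes", "512000",
                              errstr, sizeof(errstr)) != RD_KAFKA_CONF_OK)
            fprintf(stderr, "%% %s\n", errstr);

        rd_kafka_conf_destroy(conf);
        return 0;
    }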
edenhill modified the milestone: 0.8.4 Feb 22, 2014
@edenhill
Contributor

Have you had time to verify that queued.max.messages.kbytes solves your problem?

@winbatch
Author

Sorry, haven't had a chance yet.

@edenhill
Contributor

Okay, I'll close it anyway, reopen if you see the problem again (impossible!).

@kant111

kant111 commented Oct 31, 2017

@winbatch Going through this discussion, I am wondering why you ran out of memory. The default value for queued.min.messages is 100,000, and say your message size is 20 KB (which is a lot, but whatever); then 100K * 20KB = 2 GB. So what was the size of your messages, and how much memory did you see used?

@winbatch
Author

winbatch commented Oct 31, 2017

@kant111 - This was more than 3 1/2 years ago, and I'm sure many versions of Kafka and librdkafka ago.

@kant111

kant111 commented Oct 31, 2017

@winbatch I am sorry, I did not understand the latter part of your sentence.
