Marvel eats up Master's heap #9130
Comments
@mosiddi what version of Elasticsearch are you using? There have been some issues with certain stats calls which made them quite slow. These should have been fixed in v1.4.2. Also, do you have swap enabled? I'm wondering if you're seeing slow GCs thanks to swapping.
Hi @clintongormley,
@mosiddi with the default settings you're bound to have problems with slow GC. Swap is the enemy of the JVM. Also, I suggest upgrading Elasticsearch before retrying your tests. A few stats issues have been fixed since then. I'll leave this open for now, until you have had a chance to rerun your tests without swap and with the latest version of Elasticsearch.
Thanks @clintongormley! I'll try the tests out with 1.4.2. Can you point me to instructions that detail how to disable swap for ES on Windows VMs?
@mosiddi I can't, I'm afraid, but I'd just google how to disable the Windows page file.
Thanks @clintongormley, I'll look into this and keep you updated.
I updated ES to the latest version (1.4.2) on my ES master and can still see the same pattern. I'll turn off paging and see if that helps.
I also tried setting the VM's page file size (virtual memory) to 0 MB and didn't see any noticeable difference in heap usage. Please note that when I was doing the experiments, I wasn't doing any admin operations on the cluster and I had ~700 indices. When I set marvel.agent.indices to only one index, the heap usage came down.
A quick answer to:
Marvel uses the master node to issue indices stats calls (and indeed cluster stats). This means the master acts as a coordinating node for this. At the moment the indices stats calls translate to one request per shard (see #7990), which results in some load on the master and the receiving nodes. That said, marvel does the call, waits for it to complete, sleeps for 10s (default) and does it again. Even with 700 indices (7000 shards, assuming defaults) the load should be low.
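For context, here is a minimal sketch of that poll-wait-sleep pattern, assuming the 1.x Java client API. The class and method names below are illustrative, not Marvel's actual exporter code; passing index names to the stats request (which is effectively what restricting marvel.agent.indices does) shrinks the per-shard fan-out accordingly.

```java
// Illustrative sketch only, not Marvel's exporter: issue an indices-stats call,
// wait for it to complete, then sleep before the next round.
import org.elasticsearch.action.admin.indices.stats.IndicesStatsResponse;
import org.elasticsearch.client.Client;

public class StatsPollSketch {
    private static final long INTERVAL_MS = 10_000L; // 10s default export interval

    public static void pollStats(Client client) throws InterruptedException {
        while (true) {
            // With ~700 indices and default shard counts this fans out to
            // thousands of shard-level requests coordinated by the calling node.
            IndicesStatsResponse stats = client.admin().indices()
                    .prepareStats()         // all indices; pass names here to restrict
                    .execute().actionGet(); // block until the call completes
            // ... export `stats` somewhere, then wait for the next interval
            Thread.sleep(INTERVAL_MS);
        }
    }
}
```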
Thanks @bleskes! The pattern I saw was on an interval of ~1 hour: heap grew from 25% to 75% and came back, and the pattern continued. Do you want any more data from my side to look further into this?
@mosiddi do you see a quick spike in memory use, or just a slow growth and then a quick decline?
Slow growth (less than an hour) and a quick decline (in a few minutes).
Sounds like normal Java garbage collection: memory grows slowly until a certain limit is reached (75% in ES) and then it's cleaned up, hence the quick drop. I'm going to close this issue; feel free to reopen if you feel there is anything else going on.
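If you want to confirm that sawtooth yourself, a small snippet like the one below (not from this thread; it samples whichever JVM it runs in, so you would need to embed it in the node process or use the nodes stats API instead) prints heap usage over time:

```java
// Assumed helper, not part of ES: periodically sample heap usage and print the
// percentage; a GC sawtooth shows up as slow growth followed by a sharp drop.
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

public class HeapSawtoothWatcher {
    public static void main(String[] args) throws InterruptedException {
        while (true) {
            MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
            long usedPct = 100L * heap.getUsed() / heap.getMax();
            System.out.printf("heap used: %d%%%n", usedPct);
            Thread.sleep(5_000L); // sample every 5 seconds
        }
    }
}
```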
One more interesting observation: when I stop Marvel generating index aliases and do a few sets of create-alias operations, I still see the same memory pattern and timeouts. This time the growth is within a 15-minute range. The number of existing indexes is ~1300 in the test bed where I saw the issue.
@bleskes and @clintongormley: though the issue I mention above is not related to Marvel, it was seen in the same test setup. Do you have any insight into why create-alias would time out when the master CPU is within 25% and alias-creation requests are coming in at a rate of 5-10 per minute?
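For reference, here is a hypothetical reproduction of the kind of create-alias calls described here, assuming the 1.x Java client; the index and alias names are made up. Each call is a cluster-state update processed on the master, and the response is only acknowledged once the change has been accepted, which is typically where such timeouts surface.

```java
// Hypothetical load generator for create-alias requests (names are made up).
import org.elasticsearch.action.admin.indices.alias.IndicesAliasesResponse;
import org.elasticsearch.client.Client;

public class AliasLoadSketch {
    public static void createAliases(Client client) throws InterruptedException {
        for (int i = 0; i < 10; i++) {
            IndicesAliasesResponse resp = client.admin().indices()
                    .prepareAliases()
                    .addAlias("test-index-" + i, "test-alias-" + i) // hypothetical names
                    .execute().actionGet();
            if (!resp.isAcknowledged()) {
                System.out.println("create-alias request was not acknowledged in time");
            }
            Thread.sleep(6_000L); // roughly 10 requests per minute
        }
    }
}
```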
In one of my experiments, I was trying to see how much admin-request load a master can handle before it throws timeout exceptions.
My master machine configuration was an A2 Azure VM and I had a 7-node cluster (3 query, 3 data, and 1 master node). I tried a very simple experiment -
I was able to create ~650 indexes. What I noticed after a few hours was that my master's heap had a pattern of growing from 25% to 75% and coming back to 25% every time. There were a few failures as well when it was at 75%. The call stack was in Marvel's exporter, which was doing Index Stats.
2 questions -