Use new rabbitmqctl features for monitoring #916

binarin · 2016-08-10T11:47:44Z

@dmitrymex @bogdando WDYT? I haven't tested it yet, but if you are OK with the overall shape of this patch, I'll start polishing and testing it.

This is meant to stop wasting network bandwidth during health checks (e.g. a
single list_queues in a 3-node cluster with 10k queues costs on average
12 megabytes of traffic and 27k TCP packets).

The features are disabled by default to preserve compatibility, but they
SHOULD be enabled when the following patches are present in the RabbitMQ
version in use:

dmitrymex · 2016-08-10T16:04:45Z

scripts/rabbitmq-server-ha.ocf

+    if [ "$rc_timeouts" -eq 2 ]; then
+        master_score 0
+        return $OCF_ERR_GENERIC
+    elif [ $rc -ne 0 ]; then


This should be something like (rc_timeouts == 0 AND rc != 0), because (rc_timeouts == 1 AND rc == 137) is a timeout, which should be ignored.
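The condition suggested above can be sketched as a small standalone function. This is only an illustration, not the code from the patch: `classify_check` and its output strings are hypothetical names, and it assumes that `rc_timeouts` is 0 when the check did not time out, 1 for a timeout that is still tolerated, and 2 once timeouts must be treated as a failure, which is how the diff and the comment read.

```shell
#!/bin/sh
# Hypothetical helper, not part of rabbitmq-server-ha.ocf.
# $1 = exit code of the health check command (rc)
# $2 = timeout state (rc_timeouts); assumed 0 = no timeout,
#      1 = tolerated timeout, 2 = too many timeouts
classify_check() {
    if [ "$2" -eq 2 ]; then
        # Timeouts exceeded the tolerated limit: report a failure.
        echo "fail_timeouts"
    elif [ "$2" -eq 0 ] && [ "$1" -ne 0 ]; then
        # Not a timeout, but the command itself failed.
        echo "fail_error"
    else
        # Either success, or a single timeout (e.g. rc 137) that is ignored.
        echo "ok"
    fi
}

classify_check 137 1   # tolerated timeout
classify_check 1 0     # genuine error
classify_check 137 2   # too many timeouts
classify_check 0 0     # healthy
```

With a plain `rc != 0` test in the second branch, the first call would be reported as an error even though the timeout is supposed to be tolerated.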

dmitrymex · 2016-08-11T11:06:47Z

scripts/rabbitmq-server-ha.ocf

+stopped/demoted.
+</longdesc>
+<shortdesc lang="en">Use --local option for list_queues</shortdesc>
+<content type="string" default="${OCF_RESKEY_rmq_feature_local_list_queues_default}" />


Here and above you can use type="boolean", as is done for the 'debug' parameter, for example, though I am not sure it will affect anything at all.

dmitrymex · 2016-08-11T11:08:03Z

@binarin: overall the patch looks good to me; it looks like it should remove the load caused by our current monitoring.

michaelklishin · 2016-08-17T15:48:46Z

What's the conclusion on this? Should we merge it?

bogdando · 2016-08-18T07:41:04Z

+1, but please remove DO NOT MERGE if you think it's done and it works for you

binarin · 2016-08-18T10:40:12Z

@michaelklishin Please don't merge it yet, I'm still testing.

michaelklishin · 2016-08-22T14:23:24Z

This currently doesn't merge cleanly.

This will stop wasting network bandwidth for monitoring. E.g. a 200-node OpenStack installation produces around 10k queues and 10k channels. Doing a single list_queues/list_channels in a cluster in this environment results in 27k TCP packets and around 12 megabytes of network traffic. Given that these calls happen ~10 times a minute with 3 controllers, it results in pretty significant overhead.

To enable those features you should have rabbitmq containing the following patches:

- rabbitmq#883
- rabbitmq#911
- rabbitmq#915
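As a rough back-of-the-envelope check of the overhead described in the commit message above, the quoted per-call figures can be scaled to a per-minute rate. This is only a sketch: it assumes the ~10 calls a minute is the cluster-wide total and that every call costs the quoted average, and the variable names are illustrative.

```shell
#!/bin/sh
# Figures quoted in the commit message.
mb_per_call=12          # megabytes of traffic per list_queues/list_channels call
packets_per_call=27000  # TCP packets per call

# Assumption: ~10 calls per minute is the total rate across the cluster.
calls_per_min=10

mb_per_min=$((mb_per_call * calls_per_min))
mb_per_hour=$((mb_per_min * 60))
packets_per_min=$((packets_per_call * calls_per_min))
packets_per_sec=$((packets_per_min / 60))

echo "traffic: ${mb_per_min} MB/min, ${mb_per_hour} MB/hour"
echo "packets: ${packets_per_min}/min, about ${packets_per_sec}/s"
```

Under these assumptions the monitoring alone would account for roughly 120 MB of traffic per minute. If the rate is instead per controller, the totals scale by the number of controllers.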

binarin · 2016-08-23T12:01:43Z

Rebased and tested again. Now I'm happy with this patch.

michaelklishin · 2016-08-23T12:03:45Z

Thank you!

This will stop wasting network bandwidth for monitoring. E.g. a 200-node OpenStack installation produces around 10k queues and 10k channels. Doing a single list_queues/list_channels in a cluster in this environment results in 27k TCP packets and around 12 megabytes of network traffic. Given that these calls happen ~10 times a minute with 3 controllers, it results in pretty significant overhead.

Upstream change:

- rabbitmq/rabbitmq-server#916

To enable those features you should have rabbitmq containing the following patches:

- rabbitmq/rabbitmq-server#883
- rabbitmq/rabbitmq-server#911
- rabbitmq/rabbitmq-server#915

Change-Id: Icfde3360b42a841ad3a219b94f65a69b2a18cea7
Closes-Bug: 1614071

dmitrymex reviewed Aug 10, 2016

michaelklishin self-assigned this Aug 11, 2016

dmitrymex reviewed Aug 11, 2016

binarin force-pushed the rabbitmq-server-new-shiny-ocf-health-check branch from a81272b to 464f54b Compare August 17, 2016 12:19

binarin force-pushed the rabbitmq-server-new-shiny-ocf-health-check branch from 464f54b to 99f2a48 Compare August 23, 2016 09:27

binarin changed the title ~~DO NOT MERGE Use new rabbitmqctl features for monitoring~~ Use new rabbitmqctl features for monitoring Aug 23, 2016

michaelklishin added this to the 3.6.6 milestone Aug 23, 2016

michaelklishin merged commit 29a12b6 into rabbitmq:stable Aug 23, 2016
