CPU usage increased dramatically 0.8.1-RC1 -> master #1497

bobrik · 2015-05-09T13:56:48Z

I built and deployed 964e430. Running against 0.22.1 masters:

Upgrade started at 16:20, last node was updated at 16:31. Revert to 0.8.1-RC1 happened at 16:41.

I mentioned performance in #1472 as well.

kolloch · 2015-05-11T16:23:17Z

Thanks for reporting this. Which (native) Mesos library version are you using with Marathon? Note that we updated the Dockerfile to use the latest Mesos libraries just today, because the corresponding base image was only recently released.

@lloesche have you seen something similar?

bobrik · 2015-05-11T16:27:23Z

Turns out that I was using 0.22.0 as a base image, will try with 0.22.1.

bobrik · 2015-05-11T16:59:03Z

Nah, it's still bad:

Elasticsearch has hot_threads api, do you have something similar so I can give more meaningful data?

kolloch · 2015-05-12T08:00:43Z

You can use a poor man's profiling tool: could you run jstack a few times against the Marathon process and send us the stack traces?

> jstack <MARATHON_PID> > stack1.txt
> jstack <MARATHON_PID> > stack2.txt
> jstack <MARATHON_PID> > stack3.txt
...

You might have to enter the process space with docker exec to do that.
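
Once a few dumps are collected, a small script can tally which frames show up most often across them. The following is a minimal sketch in Python and is not part of Marathon: the function name `hot_frames` and the default package prefix are illustrative assumptions, and it counts every thread regardless of state, so waiting threads are included too.

```python
import re
from collections import Counter

# A thread section in jstack output starts with the quoted thread name,
# e.g.  "qtp265226115-746" prio=10 tid=0x... nid=0xd88 waiting on condition
THREAD_HEADER = re.compile(r'^"[^"]+"')
# A stack frame line looks like:  at some.package.Class.method(File.scala:44)
FRAME = re.compile(r'^\s+at (?P<frame>[^\s(]+)\(')


def hot_frames(dumps, prefix="mesosphere.marathon"):
    """Count, per thread and per dump, the topmost frame that starts with `prefix`.

    `dumps` is an iterable of strings, each holding the text of one jstack dump.
    Frames that appear in many threads and many dumps are candidates for a
    closer look.
    """
    counts = Counter()
    for text in dumps:
        counted = True  # ignore frames that appear before the first thread header
        for line in text.splitlines():
            if THREAD_HEADER.match(line):
                counted = False  # new thread: look for its first matching frame
                continue
            match = FRAME.match(line)
            if match and not counted and match.group("frame").startswith(prefix):
                counts[match.group("frame")] += 1
                counted = True
    return counts
```

To use it, read each saved dump file into a string, pass the list to `hot_frames`, and print `most_common(10)` on the result.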

bobrik · 2015-05-12T08:23:40Z

Here it is: https://gist.github.com/bobrik/87b8903cc3d502afe888

drexin · 2015-05-12T10:47:39Z

@bobrik I couldn't reproduce this. I ran v0.8.2-RC2 in a Docker container and started 100 tasks, and the Marathon process never used considerably more than 10% CPU. What does your setup look like?

bobrik · 2015-05-12T10:50:42Z

Marathon is running on Intel(R) Xeon(R) CPU E3-1230 V2 @ 3.30GHz, 8 cores.

Health checks are running once per 3-5 seconds.

drexin · 2015-05-12T16:06:47Z

@bobrik Would it be possible for you to change the health checks to COMMAND checks that call curl and see if the CPU usage is still that high then?

bobrik · 2015-05-12T21:28:03Z

I tried, but it didn't work:

        healthChecks:
          - protocol: COMMAND
            command:
              value: "curl -f -X GET http://$HOST:$PORT0/?n=marathon_healthcheck"
            gracePeriodSeconds: 15
            maxConsecutiveFailures: 300
            intervalSeconds: 2
            timeoutSeconds: 5

{"log":"[2015-05-12 21:26:54,159] INFO Received status update for task topface_lenny-test.67ca9fbb-f8ec-11e4-ab20-56847afe9799: TASK_RUNNING () (mesosphere.marathon.MarathonScheduler:148)\n","stream":"stdout","time":"2015-05-12T21:26:54.159345729Z"}
{"log":"[2015-05-12 21:26:54,160] INFO Received status for [topface_lenny-test.67ca9fbb-f8ec-11e4-ab20-56847afe9799] with version [2015-05-12T21:18:15.625Z] and healthy [false] (mesosphere.marathon.health.MarathonHealthCheckManager:150)\n","stream":"stdout","time":"2015-05-12T21:26:54.160155174Z"}
{"log":"[2015-05-12 21:26:54,160] INFO Forwarding health result [Unhealthy(topface_lenny-test.67ca9fbb-f8ec-11e4-ab20-56847afe9799,2015-05-12T21:18:15.625Z,,2015-05-12T21:26:54.160Z)] to health check actor [Actor[akka://marathon/user/$N#1518643597]] (mesosphere.marathon.health.MarathonHealthCheckManager:171)\n","stream":"stdout","time":"2015-05-12T21:26:54.1603141Z"}

No more info is provided to resolve the issue. Task is healthy and works with http check. Can be related to #1380.

bobrik · 2015-05-12T21:36:43Z

        healthChecks:
          - protocol: COMMAND
            command:
              value: "true"
            gracePeriodSeconds: 15
            maxConsecutiveFailures: 300
            intervalSeconds: 2
            timeoutSeconds: 5

120 instances, 0.8.1-RC1. Not sure if upgrading to master wouldn't kill marathon completely.

bobrik · 2015-05-13T06:43:30Z

Probably worth mentioning: each of 3 marathons receives 1 rps for /v2/apps?embed=apps.tasks for service discovery reasons, in total it's 3 rps to master.

kolloch · 2015-05-13T08:56:58Z

I guess these queries are more or less the only thing I see in the stack traces that sticks out:

"qtp265226115-746" prio=10 tid=0x00007f0964023000 nid=0xd88 waiting on condition [0x00007f091bbf8000]
   java.lang.Thread.State: TIMED_WAITING (parking)
    at sun.misc.Unsafe.park(Native Method)
    - parking to wait for  <0x00000007dafdd118> (a scala.concurrent.impl.Promise$CompletionLatch)
    at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedNanos(AbstractQueuedSynchronizer.java:1033)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1326)
    at scala.concurrent.impl.Promise$DefaultPromise.tryAwait(Promise.scala:208)
    at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:218)
    at scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:223)
    at scala.concurrent.Await$$anonfun$result$1.apply(package.scala:190)
    at scala.concurrent.BlockContext$DefaultBlockContext$.blockOn(BlockContext.scala:53)
    at scala.concurrent.Await$.result(package.scala:190)
    at mesosphere.marathon.api.RestResource$class.result(RestResource.scala:44)
    at mesosphere.marathon.api.v2.AppsResource.result(AppsResource.scala:29)
    at mesosphere.marathon.api.v2.AppsResource.mesosphere$marathon$api$v2$AppsResource$$enrichedTasks(AppsResource.scala:294)
    at mesosphere.marathon.api.v2.AppsResource$$anonfun$5.apply(AppsResource.scala:57)
    at mesosphere.marathon.api.v2.AppsResource$$anonfun$5.apply(AppsResource.scala:55)
    at scala.collection.immutable.Stream$$anonfun$map$1.apply(Stream.scala:418)
    at scala.collection.immutable.Stream$$anonfun$map$1.apply(Stream.scala:418)
    at scala.collection.immutable.Stream$Cons.tail(Stream.scala:1222)
    - locked <0x00000007dafdbf40> (a scala.collection.immutable.Stream$Cons)
    at scala.collection.immutable.Stream$Cons.tail(Stream.scala:1212)
    at scala.collection.immutable.Stream$$anonfun$map$1.apply(Stream.scala:418)
    at scala.collection.immutable.Stream$$anonfun$map$1.apply(Stream.scala:418)
    at scala.collection.immutable.Stream$Cons.tail(Stream.scala:1222)
    - locked <0x00000007dafdbf90> (a scala.collection.immutable.Stream$Cons)
    at scala.collection.immutable.Stream$Cons.tail(Stream.scala:1212)
    at scala.collection.immutable.Stream.foreach(Stream.scala:595)
    at play.api.libs.json.JsValueSerializer.serialize(JsValue.scala:311)
    at play.api.libs.json.JsValueSerializer$$anonfun$serialize$2.apply(JsValue.scala:320)
    at play.api.libs.json.JsValueSerializer$$anonfun$serialize$2.apply(JsValue.scala:318)
    at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
    at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
    at play.api.libs.json.JsValueSerializer.serialize(JsValue.scala:318)
    at play.api.libs.json.JsValueSerializer.serialize(JsValue.scala:302)
    at com.fasterxml.jackson.databind.ser.DefaultSerializerProvider.serializeValue(DefaultSerializerProvider.java:114)
    at com.fasterxml.jackson.databind.ObjectMapper.writeValue(ObjectMapper.java:1887)
    at play.api.libs.json.JacksonJson$.generateFromJsValue(JsValue.scala:495)
    at play.api.libs.json.Json$.stringify(Json.scala:51)
    at play.api.libs.json.JsValue$class.toString(JsValue.scala:80)
    at play.api.libs.json.JsObject.toString(JsValue.scala:166)
    at mesosphere.marathon.api.v2.AppsResource.index(AppsResource.scala:87)
    at sun.reflect.GeneratedMethodAccessor54.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at com.sun.jersey.spi.container.JavaMethodInvokerFactory$1.invoke(JavaMethodInvokerFactory.java:60)
    at com.sun.jersey.server.impl.model.method.dispatch.AbstractResourceMethodDispatchProvider$TypeOutInvoker._dispatch(AbstractResourceMethodDispatchProvider.java:185)
    at com.sun.jersey.server.impl.model.method.dispatch.ResourceJavaMethodDispatcher.dispatch(ResourceJavaMethodDispatcher.java:75)
    at com.codahale.metrics.jersey.InstrumentedResourceMethodDispatchProvider$TimedRequestDispatcher.dispatch(InstrumentedResourceMethodDispatchProvider.java:30)
    at com.sun.jersey.server.impl.uri.rules.HttpMethodRule.accept(HttpMethodRule.java:302)
    at com.sun.jersey.server.impl.uri.rules.ResourceClassRule.accept(ResourceClassRule.java:108)
    at com.sun.jersey.server.impl.uri.rules.RightHandPathRule.accept(RightHandPathRule.java:147)
    at com.sun.jersey.server.impl.uri.rules.RootResourceClassesRule.accept(RootResourceClassesRule.java:84)
    at com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(WebApplicationImpl.java:1542)
    at com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(WebApplicationImpl.java:1473)
    at com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebApplicationImpl.java:1419)
    at com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebApplicationImpl.java:1409)
    at com.sun.jersey.spi.container.servlet.WebComponent.service(WebComponent.java:409)
    at com.sun.jersey.spi.container.servlet.ServletContainer.service(ServletContainer.java:540)
    at com.sun.jersey.spi.container.servlet.ServletContainer.service(ServletContainer.java:715)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:848)
    at com.google.inject.servlet.ServletDefinition.doService(ServletDefinition.java:263)
    at com.google.inject.servlet.ServletDefinition.service(ServletDefinition.java:178)
    at com.google.inject.servlet.ManagedServletPipeline.service(ManagedServletPipeline.java:91)
    at com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:62)
    at mesosphere.marathon.api.CacheDisablingFilter.doFilter(CacheDisablingFilter.scala:18)
    at com.google.inject.servlet.FilterDefinition.doFilter(FilterDefinition.java:163)
    at com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:58)
    at mesosphere.marathon.api.CORSFilter.doFilter(CORSFilter.scala:46)
    at com.google.inject.servlet.FilterDefinition.doFilter(FilterDefinition.java:163)
    at com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:58)
    at mesosphere.marathon.api.LeaderProxyFilter.doFilter(LeaderProxyFilter.scala:56)
    at com.google.inject.servlet.FilterDefinition.doFilter(FilterDefinition.java:163)
    at com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:58)
    at com.google.inject.servlet.ManagedFilterPipeline.dispatch(ManagedFilterPipeline.java:118)
    at com.google.inject.servlet.GuiceFilter.doFilter(GuiceFilter.java:113)
    at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1467)
    at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:501)
    at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1086)
    at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:429)
    at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1020)
    at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135)
    at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116)
    at com.codahale.metrics.jetty8.InstrumentedHandler.handle(InstrumentedHandler.java:192)
    at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:154)
    at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116)
    at org.eclipse.jetty.server.Server.handle(Server.java:370)
    at org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:494)
    at org.eclipse.jetty.server.AbstractHttpConnection.headerComplete(AbstractHttpConnection.java:971)
    at org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.headerComplete(AbstractHttpConnection.java:1033)
    at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:644)
    at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:235)
    at org.eclipse.jetty.server.AsyncHttpConnection.handle(AsyncHttpConnection.java:82)
    at org.eclipse.jetty.io.nio.SelectChannelEndPoint.handle(SelectChannelEndPoint.java:696)
    at org.eclipse.jetty.io.nio.SelectChannelEndPoint$1.run(SelectChannelEndPoint.java:53)
    at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608)
    at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543)
    at java.lang.Thread.run(Thread.java:745)

I don't know yet why this would have changed recently.

bobrik · 2015-05-13T13:57:09Z

I've shut down service discovery updater for a few minutes on 0.8.1-RC1 and look at that:

The second decrease in cpu usage on the graph happened when I closed browser tab with marathon.

bobrik · 2015-05-14T09:44:36Z

Flamegraphs: https://gist.github.com/bobrik/969d322bb28c6a649cf7

https://github.com/jrudolph/perf-map-agent

0.8.1-RC1:

0.8.2-SNAPSHOT:

Blue line, higher is 0.8.2:

dgromov · 2015-05-17T15:01:30Z

@drexin, were you using OpenJDK when you tried reproducing this? I wonder if that had something to do with it. https://gist.github.com/bobrik/87b8903cc3d502afe888 suggests that these numbers are all using that.

apuckey · 2015-05-18T04:57:36Z

I'm using Sun Java 1.7 not openjdk if that makes a difference

kolloch · 2015-05-18T17:13:33Z

Hi @bobrik, BTW I hope that it is clear that we really appreciate your detailed reporting. Unfortunately, we can't reproduce it so far.

I could imagine that it has something to do with the Mesos Library changes. Did you try 0.8.1-RC1 with the 0.22.1 mesos libraries by any chance? I am not sure if that is a supported configuration but if that exhibits the same CPU pattern, the reason could lie in the new Mesos Library version.

I think it might make sense to implement #1539 soon and check if your problems persist.

What do you think?

bobrik · 2015-05-18T19:23:54Z

Should I just try 0.8.1-RC1 tag on top of mesosphere/mesos:0.22.1 docker image?

Removing native code sounds like a good idea, too much is happening there.

kolloch · 2015-05-19T08:29:57Z

Hi @bobrik, if it's not a big hassle (at least in comparison to the things you have already done), trying 0.8.1-RC1 on top of mesosphere/mesos:0.22.1 would be grand. 👍

bobrik · 2015-05-19T10:04:09Z

Looks like marathon itself is to blame and 0.22.1 is even better with 0.8.2-SNAPSHOT. Snapshot is from today's master, btw.

~ 12:10 marathon 0.8.1-RC1, mesos 0.22.1
~ 12:25 marathon 0.8.1-RC1, mesos 0.22.0
~ 12:38 marathon 0.8.2-SNAPSHOT, mesos 0.22.1
~ 12:47 marathon 0.8.2-SNAPSHOT, mesos 0.22.0

bobrik · 2015-05-21T11:50:07Z

Ok, I'll try to collect metrics from master on 0.8.1-RC1 and 0.8.2-RC3 with --enable_metrics after 30 minutes of regular load.

Meanwhile, can you tell me what is needed from zk when I ask for /v2/apps?embed=apps.tasks? I thought that marathon keeps everything in memory even though state is written to zk for recovery.

kolloch · 2015-05-21T11:55:08Z

Hi @bobrik, actually, reads currently go to Zookeeper as well. We want to change that. Basically, it is also a trade-off between looking at current user problems (which needs time) and rewriting some of the code (which needs time) which might actually solve these issues anyway.

So, without wanting to sound smart, analyzing this issue actually prevents me from rewriting code. But I do not like to release 0.8.2 before we understand the implications.

bobrik · 2015-05-21T11:55:32Z

Here are the graphs from zk cluster, looks suspicious if you ask me:

There is a lot smaller load from 0.8.1-RC1, especially in terms of bytes per second.

kolloch · 2015-05-21T13:29:10Z

Hi @bobrik,

I cannot really make sense of your graphs. What's the old, what's the new version? What makes you suspicious?

Can you actually tell us the configuration parameters you start marathon with? I assume, they are the same between the old and the new version? Thanks.

bobrik · 2015-05-21T14:06:59Z

Sorry for not making it clear. Environment vars for marathon (in an ansible playbook):

          MARATHON_MASTER: zk://web488:2181,web489:2181,web490:2181/mesos
          MARATHON_ZK: zk://web488:2181,web489:2181,web490:2181/marathon-new
          MARATHON_ZK_MAX_VERSIONS: 10
          MARATHON_HOSTNAME: "{{ inventory_hostname }}"

They are the same for both versions.

Now to the graphs, new ones this time, hope they are more clear. Here I ran 0.8.2-RC3 for 40 minutes, then 0.8.1-RC1 for 40 minutes, then 0.8.1-RC1 on top of 0.22.1 libs for 10 minutes.

Marathon cluster:

Zookeeper cluster for marathon and mesos, same time:

The enormous difference in the bandwidth used to zookeeper is suspicious: 200 kb/s vs 5000 kb/s. CPU load and packet rate are also higher with 0.8.2-RC3.

Metrics for this, for 0.8.2 with --enable-metrics: https://gist.github.com/bobrik/96997d0030338fa1dc15

Does it make sense now? Thank you for your patience.

kolloch · 2015-05-26T16:19:04Z

Hi @bobrik,

thanks to your extensive reporting, we found the offender. The metrics in the gist helped us out.

If you are really adventurous, you can checkout the pk/1497_health_statuses_more_efficient branch. Otherwise, you can wait for us to merge to master and release another RC.

bobrik · 2015-05-26T17:12:01Z

Thanks, I'll test it tomorrow. Is there an issue to remove unnecessary zk read requests?

bobrik · 2015-05-27T09:22:22Z

Marathon master built from origin/pr/1568, running from 12:09:

ZK traffic didn't change compared to 0.8.1-RC1, though. Workload is slightly different since I query more groups now.

kolloch · 2015-05-27T09:27:13Z

Hi @bobrik, that is surprising. Can you export the metrics for us again?

bobrik · 2015-05-27T09:40:02Z

Metrics after 10 minutes: https://gist.github.com/bobrik/bb8b852eb1156624a3b8

kolloch · 2015-05-27T10:29:04Z

I'll try to summarize the findings (correct me if I am wrong).

When

- using the proposed fixed version instead of 0.8.1-RC1
- with the same load (which is different from the load you tested with before)

you see that

- The CPU usage goes up from ~600ms to ~740ms.
- The ZK traffic doesn't change significantly.

So the new version still uses more CPU than 0.8.1-RC1 but is otherwise fine.

The increased CPU could potentially be explained by more requests. At least in the last comparison with full metrics, we saw significantly more requests against the new version for some reason, maybe because of faster response times. The new version has a mean response time of 35ms for AppsResource.index compared to 126ms for the old version.

If that is correct, we would like to release a new RC with the fix.
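
To check the "more requests, faster responses" reading, two metrics exports can be compared timer by timer. This is only a sketch, and it rests on an assumption: that each export is a JSON document with a top-level `timers` object whose entries carry `count` and `mean` fields, which is the usual Dropwizard Metrics layout. The function names are hypothetical.

```python
import json


def timer_summary(metrics_json):
    """Map timer name -> (count, mean) for one metrics export.

    Assumes a top-level "timers" object with "count" and "mean" per entry.
    """
    timers = json.loads(metrics_json).get("timers", {})
    return {name: (t["count"], t["mean"]) for name, t in timers.items()}


def compare_timers(old_json, new_json):
    """Return (name, count_ratio, mean_ratio) for timers present in both exports.

    count_ratio > 1 means the new version served more calls;
    mean_ratio < 1 means its calls were faster on average.
    """
    old, new = timer_summary(old_json), timer_summary(new_json)
    rows = []
    for name in sorted(old.keys() & new.keys()):
        old_count, old_mean = old[name]
        new_count, new_mean = new[name]
        if old_count == 0 or old_mean == 0:
            continue  # skip timers that never fired in the old export
        rows.append((name, new_count / old_count, new_mean / old_mean))
    return rows
```

A timer whose count ratio is well above 1 while its mean ratio is below 1 would fit the explanation above. The comparison is only meaningful if both exports cover the same amount of time.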

bobrik · 2015-05-27T11:14:59Z

> The increased CPU could be potentially explained by more requests.

0.8.1-RC1:

Document Path:          /v2/apps?embed=apps.tasks
Document Length:        73417 bytes

Concurrency Level:      10
Time taken for tests:   51.741 seconds
Complete requests:      1000
Failed requests:        0
Write errors:           0
Total transferred:      73581000 bytes
HTML transferred:       73417000 bytes
Requests per second:    19.33 [#/sec] (mean)
Time per request:       517.413 [ms] (mean)
Time per request:       51.741 [ms] (mean, across all concurrent requests)
Transfer rate:          1388.76 [Kbytes/sec] received

PR 1568:

Document Path:          /v2/apps?embed=apps.tasks
Document Length:        76127 bytes

Concurrency Level:      10
Time taken for tests:   84.561 seconds
Complete requests:      1000
Failed requests:        0
Write errors:           0
Total transferred:      76291000 bytes
HTML transferred:       76127000 bytes
Requests per second:    11.83 [#/sec] (mean)
Time per request:       845.606 [ms] (mean)
Time per request:       84.561 [ms] (mean, across all concurrent requests)
Transfer rate:          881.06 [Kbytes/sec] received

Much better than master, but still worse than 0.8.1-RC1.

With reduced background usage (only /v2/apps?embed=apps.tasks at 3rps) CPU usage is roughly the same at 320ms, but max RPS differ:

0.8.1-RC1:

Document Path:          /v2/apps?embed=apps.tasks
Document Length:        72913 bytes

Concurrency Level:      10
Time taken for tests:   47.749 seconds
Complete requests:      1000
Failed requests:        0
Write errors:           0
Total transferred:      73077000 bytes
HTML transferred:       72913000 bytes
Requests per second:    20.94 [#/sec] (mean)
Time per request:       477.487 [ms] (mean)
Time per request:       47.749 [ms] (mean, across all concurrent requests)
Transfer rate:          1494.58 [Kbytes/sec] received

PR 1568:

Document Path:          /v2/apps?embed=apps.tasks
Document Length:        75623 bytes

Concurrency Level:      10
Time taken for tests:   70.917 seconds
Complete requests:      1000
Failed requests:        0
Write errors:           0
Total transferred:      75787000 bytes
HTML transferred:       75623000 bytes
Requests per second:    14.10 [#/sec] (mean)
Time per request:       709.175 [ms] (mean)
Time per request:       70.917 [ms] (mean, across all concurrent requests)
Transfer rate:          1043.62 [Kbytes/sec] received

Go ahead with the RC. I'll reduce the load with label selectors, SSE, and probably the Mesos API as the source of truth.
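
For anyone repeating this comparison, the throughput gap can be read straight out of the ab reports. A minimal sketch in Python, assuming each report is saved as text; the function names are illustrative:

```python
import re

# Matches the summary line printed by ab, e.g.
# "Requests per second:    19.33 [#/sec] (mean)"
RPS_LINE = re.compile(r"^Requests per second:\s+([\d.]+)", re.MULTILINE)


def requests_per_second(ab_output):
    """Extract the mean requests-per-second figure from one ab report."""
    match = RPS_LINE.search(ab_output)
    if match is None:
        raise ValueError("no 'Requests per second' line found")
    return float(match.group(1))


def relative_throughput(baseline_output, candidate_output):
    """Candidate throughput as a fraction of the baseline throughput."""
    return requests_per_second(candidate_output) / requests_per_second(baseline_output)
```

With the reports above, PR 1568 reaches about 61% of the 0.8.1-RC1 throughput in the first pair of runs (11.83 vs 19.33 requests per second) and about 67% in the second pair (14.10 vs 20.94).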

kolloch · 2015-05-27T13:16:20Z

Hi @bobrik, the strange thing is that the metrics that you gave us earlier told a different story.

If you want, you can still send us the related metrics and I'll have a look.

We will not release master as an RC but the old RC with only this single fix. Maybe it works better, maybe not.

drexin self-assigned this May 12, 2015

bobrik mentioned this issue May 15, 2015

--zk_max_versions doesn't apply to group versions? #1533

Closed

kolloch added this to the 0.8.2 milestone May 18, 2015

kolloch assigned aquamatthias and unassigned drexin May 18, 2015

kolloch assigned kolloch and unassigned aquamatthias May 19, 2015

kolloch added blocker labels May 19, 2015

kolloch added the in progress label May 21, 2015

kolloch removed the analyze label May 26, 2015

kolloch pushed a commit that referenced this issue May 26, 2015

Fixes #1497 - Do not query app versions in MarathonHealthCheckManager…

d8cc929

….statuses and make MarathonHealthCheckManager data structures more efficient

kolloch pushed a commit that referenced this issue May 26, 2015

Fixes #1497 - Do not query app versions in MarathonHealthCheckManager…

8c5a0cf

….statuses and make MarathonHealthCheckManager data structures more efficient

kolloch added ready for review and removed in progress labels May 26, 2015

kolloch added in progress and removed ready for review labels May 27, 2015

kolloch added ready for review and removed in progress labels May 27, 2015

aquamatthias closed this as completed in 4279b6b May 27, 2015

aquamatthias added a commit that referenced this issue May 27, 2015

Merge pull request #1568 from mesosphere/pk/1497_health_statuses_more…

aa13a8b

…_efficient Fixes #1497 - Do not query app versions in MarathonHealthCheckManager

aquamatthias removed the ready for review label May 27, 2015

kolloch mentioned this issue May 27, 2015

Report number of apps/groups in metrics #1573

Closed

kolloch pushed a commit that referenced this issue May 27, 2015

Fixes #1497 - Do not query app versions in MarathonHealthCheckManager…

8368d21

….statuses and make MarathonHealthCheckManager data structures more efficient

d2iq-archive locked and limited conversation to collaborators Mar 27, 2017
