Game freezes for long periods of time #2957

Closed
pleroy opened this issue Apr 23, 2021 · 5 comments · Fixed by #2959

pleroy (Member) commented Apr 23, 2021

Yesterday I experienced a long freeze after clicking Delete on the last manœuvre of a flight plan. Excerpt from the log:

I0422 21:02:29.189894 15804 pile_up.cpp:81] Constructing pile up at 00000001027E2210
I0422 21:02:29.189894 15804 part.cpp:219] Adding part PotatoRoid (ED279457) to the pile up at 00000001027E2210
I0422 21:02:35.641894  5132 vessel.cpp:567] Prognostication from +9.29382870146968365e+08 s finished at +1.45488000000000000e+09 s with DEADLINE_EXCEEDED: Couldn't reach +inf s, stopping at +1.45488000000000000e+09 s for #autoLOC_501232 (5569664e-39fe-41a1-863d-5a26dcc92919)
I0422 21:02:45.401894  5132 vessel.cpp:567] Prognostication from +9.29382872186966419e+08 s finished at +1.55988000000000000e+09 s with DEADLINE_EXCEEDED: Couldn't reach +inf s, stopping at +1.55988000000000000e+09 s for #autoLOC_501232 (5569664e-39fe-41a1-863d-5a26dcc92919)
E0422 21:04:06.604894 15804 interface_body.hpp:479] Reached maximum step count 1000 at time +9.38477598398219109e+08 s; requested t_final is +9.46675137203982592e+08 s.
I0422 21:04:08.929894  5132 vessel.cpp:567] Prognostication from +9.29382874426964283e+08 s finished at +1.66488000000000000e+09 s with DEADLINE_EXCEEDED: Couldn't reach +inf s, stopping at +1.66488000000000000e+09 s for #autoLOC_501232 (5569664e-39fe-41a1-863d-5a26dcc92919)
I0422 21:04:20.381705  5132 vessel.cpp:567] Prognostication from +9.29382877106961727e+08 s finished at +1.76988000000000000e+09 s with DEADLINE_EXCEEDED: Couldn't reach +inf s, stopping at +1.76988000000000000e+09 s for #autoLOC_501232 (5569664e-39fe-41a1-863d-5a26dcc92919)

Note how it goes into a coma for 80 s. During that time the CPU usage was solid at 25%, i.e., one thread using as much CPU as it could get. To be honest, the version is 2021041203-Grassmann-61-g7b25619052da0f37d774ff3340f4aa4a09b51d61, so I could have broken something.

Today @lpgagnon reports on Discord:

is it expected that changing the Duration in the Orbit Analyzer now freezes the game while it recomputes? it's making longer durations sort of unusable
(10d in lowish moon orbit: maybe a 1s hiccup. 200d: wait 5 minutes for the game to un-hang)

Note that 200 days in 5 minutes is a warp factor of 60 000× for a single vessel. This is pretty slow.
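(Arithmetically: 200 d × 86 400 s/d ≈ 1.73 × 10⁷ s of game time in 300 s of wall-clock time, i.e., about 57 600×.)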

I'm not sure why or how we would burn CPU on the UI thread, but I suspect the orbit analyser. This may also relate to #2953: if we do something expensive on Delete, it is conceivable that it would burn memory.

May or may not be related: the progress bar seems broken.

@lpgagnon: any chance of getting a journal?

lpgagnon commented

journal

Steps:

  • from the tracking station, enable journaling and jump to vessel 0013 moonshot orbit, a probe in a ~80 km × 600 km near-polar orbit
  • click Analysis button
  • default Duration is 7 days. Select the 7, replace by 10, click somewhere to apply
  • game freezes (screen stops updating) for a second or so
  • change duration again, to 10d1h
  • game freezes for 4 or 5 seconds
  • turn off journaling

Duration of freeze is related in some way to the duration entered, but it's not clear how. The impression I'm getting is that it's "lagging", the freeze duration corresponding to the previous value in the Duration field rather than the updated value.

CPU usage drops during the freeze (from ~50% to ~20%); RAM usage seems basically stable (climbing at ~1MB/s, but that's also the case when running normally)

pleroy (Member, Author) commented Apr 23, 2021

The impression I'm getting is that it's "lagging", the freeze duration corresponding to the previous value in the Duration field rather than the updated value.

The impression I'm getting is that something funky is happening at the destruction of the analyzer thread, which would be consistent with this lag.

pleroy (Member, Author) commented Apr 23, 2021

Had some trouble replaying the journal, but that will be the subject of another issue.

24% of the CPU is spent in LInfinityError. Something is wrong with downsampling.

pleroy (Member, Author) commented Apr 24, 2021

I found two problems:

  1. The computation of the force between celestials is missing a RETURN_IF_STOPPED. This isn't what is causing trouble in the journal above, but I noticed it when reading the code. This seems wrong.
  2. The integrators' Solve functions (all of them) are not stoppable. When they get an error status (which they get when RETURN_IF_STOPPED is called), they record it but try to continue until the desired end time. I believe that this is an invariant that we established early on.

The bottom line is that, when Interrupt is called on the OrbitAnalyser, the rest of the system replies "Well, that's just, like, your opinion, man..." and the UI thread is effectively blocked.
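
To make point 2 concrete, here is a minimal self-contained sketch of the difference between a Solve loop that merely records an error status and one that is actually stoppable. The names (Status, State, AdvanceOneStep, stop_requested) are invented for illustration; this is not Principia's actual integrator code.

```c++
#include <atomic>

// Toy stand-ins; all names here are invented for this sketch.
struct Status {
  bool ok = true;
  void Update(Status const& other) { ok = ok && other.ok; }
};

struct State {
  double time = 0;
};

std::atomic<bool> stop_requested{false};

// One integrator step.  A RETURN_IF_STOPPED-style check turns an interruption
// request into an error status, but the step itself still happens.
Status AdvanceOneStep(State& state) {
  Status status;
  if (stop_requested) {
    status.ok = false;
  }
  state.time += 1;  // Pretend we integrated one step.
  return status;
}

// The behaviour described above: the error is recorded, but the loop keeps
// going until t_final, so an Interrupt has no visible effect until then.
Status SolveNotStoppable(double const t_final, State& state) {
  Status status;
  while (state.time < t_final) {
    status.Update(AdvanceOneStep(state));
  }
  return status;
}

// A stoppable variant returns as soon as a step reports an error.
Status SolveStoppable(double const t_final, State& state) {
  while (state.time < t_final) {
    Status const step_status = AdvanceOneStep(state);
    if (!step_status.ok) {
      return step_status;
    }
  }
  return Status();
}
```

With the stoppable variant, setting stop_requested from another thread makes Solve return within one step instead of grinding on to t_final.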

eggrobin (Member) commented

Re 1.: if the computation of the force between celestials takes noticeable time, we have a problem; let's not pepper RETURN_IF_ERROR into this very tight loop, as once per integrator step is frequent enough.
Re 2.: it is a problem, but it should have been a problem for a long time; it is possibly made worse by the now-slow Append, I suppose.
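
For illustration, a sketch of that granularity (invented names again, not Principia's actual code): the interruption check sits in the per-step loop, outside the tight pairwise force loop.

```c++
#include <atomic>
#include <cmath>
#include <vector>

std::atomic<bool> interrupt_requested{false};

struct Body {
  double x = 0;   // Position (1-dimensional toy model).
  double v = 0;   // Velocity.
  double a = 0;   // Acceleration.
  double gm = 1;  // Gravitational parameter.
};

// The very tight pairwise loop: no interruption check in here, since it would
// be evaluated n(n-1) times per step for no real gain in responsiveness.
void ComputeAccelerations(std::vector<Body>& bodies) {
  for (auto& b1 : bodies) {
    b1.a = 0;
    for (auto const& b2 : bodies) {
      if (&b1 == &b2) continue;
      double const dx = b2.x - b1.x;
      b1.a += b2.gm * dx / std::abs(dx * dx * dx);  // Inverse-square pull.
    }
  }
}

// One check per integrator step is frequent enough to make the computation
// responsive to an Interrupt without slowing down the inner loop.
bool Integrate(std::vector<Body>& bodies, int const steps, double const dt) {
  for (int i = 0; i < steps; ++i) {
    if (interrupt_requested) {
      return false;  // Stoppable at step granularity.
    }
    ComputeAccelerations(bodies);
    for (auto& b : bodies) {
      b.v += b.a * dt;
      b.x += b.v * dt;
    }
  }
  return true;
}
```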
