Principia is slow on macOS (possibly due to Unity's allocator) #2899
Comments
This is a very interesting finding, and it would explain why people keep complaining about performance on macOS (we don't play on that platform). What is not clear to me is if the Great Mutex of Death™ is in the Unity custom allocator or deeper, in the system |
Update: I modified the |
Very nice progress, congratulations!
Note that it may or may not be necessary to change all the containers: my guess is that a small number of data structures (mostly |
I was able to replace std::allocator with custom allocator without modifying the existing source by first defining some aliases in namespace Unfortunately, unique_ptrs are not affected by this approach as they do not use allocators (they only contain a deleter template parameter). I don't see any way to safely fix the unique_ptrs without changing |
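For readers following along, here is a minimal sketch of what the alias approach described above could look like. It is not the actual Principia change: the namespace `sketch`, the `MallocAllocator` template and the aliases are hypothetical names, and it assumes that `std::malloc`/`std::free` reach the system allocator while only `operator new` is intercepted by Unity.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <functional>
#include <limits>
#include <map>
#include <new>
#include <utility>
#include <vector>

// All names here are hypothetical; this is an illustration, not Principia code.
namespace sketch {

// A minimal C++17 allocator that obtains memory from std::malloc and returns
// it with std::free, so containers using it never call the global
// operator new.  Assumption: malloc is not itself routed through the custom
// allocator.  Over-aligned types are not handled.
template<typename T>
struct MallocAllocator {
  using value_type = T;

  MallocAllocator() noexcept = default;
  template<typename U>
  MallocAllocator(MallocAllocator<U> const&) noexcept {}

  T* allocate(std::size_t n) {
    // Guard against overflow of n * sizeof(T).
    if (n > std::numeric_limits<std::size_t>::max() / sizeof(T)) {
      throw std::bad_alloc();
    }
    // malloc(0) may return a null pointer, so always request at least 1 byte.
    std::size_t const bytes = n == 0 ? 1 : n * sizeof(T);
    void* const p = std::malloc(bytes);
    if (p == nullptr) {
      throw std::bad_alloc();
    }
    return static_cast<T*>(p);
  }

  void deallocate(T* p, std::size_t) noexcept {
    std::free(p);
  }
};

// The allocator is stateless, so any two instances are interchangeable.
template<typename T, typename U>
bool operator==(MallocAllocator<T> const&, MallocAllocator<U> const&) noexcept {
  return true;
}
template<typename T, typename U>
bool operator!=(MallocAllocator<T> const&, MallocAllocator<U> const&) noexcept {
  return false;
}

// Aliases with the same spelling as the standard containers.  Code inside this
// namespace that writes `vector<T>` or `map<K, V>` picks these up unchanged.
template<typename T>
using vector = std::vector<T, MallocAllocator<T>>;

template<typename K, typename V, typename Compare = std::less<K>>
using map = std::map<K, V, Compare, MallocAllocator<std::pair<K const, V>>>;

// Usage: the call sites look exactly like code using the std containers.
inline void Example() {
  vector<double> times;
  times.push_back(1.5);
  map<int, double> table;
  table[3] = times.front();
  assert(table.at(3) == 1.5);
}

}  // namespace sketch
```

The point of the aliases is that existing code written as `vector<T>` compiles unchanged, while code that spells out `std::vector<T>` still gets the default allocator and has to be edited by hand.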
I've gotten to a state that I'm happy with (I no longer see Unity code being called by Principia code in traces). The only necessary unique_ptr replacements were those pertaining to ContinuousTrajectory and DiscreteTrajectory, as predicted. I'll clean up my changes and send them for review. The game is much more playable now :)
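Since `std::unique_ptr` has no allocator parameter, one way to keep such objects away from `operator new` is to pair a custom deleter with a factory function. The following is only a sketch under that assumption: `MallocDeleter`, `malloc_unique_ptr` and `make_malloc_unique` are hypothetical names, not what the actual change uses, and it again assumes that `std::malloc` reaches the system allocator.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <memory>
#include <new>
#include <utility>

// All names here are hypothetical; this is an illustration, not Principia code.
namespace sketch {

// Deleter that runs the destructor explicitly and releases the storage with
// std::free, mirroring the std::malloc done in make_malloc_unique below.
template<typename T>
struct MallocDeleter {
  void operator()(T* p) const noexcept {
    if (p != nullptr) {
      p->~T();
      std::free(p);
    }
  }
};

template<typename T>
using malloc_unique_ptr = std::unique_ptr<T, MallocDeleter<T>>;

// Counterpart of std::make_unique: the storage comes from std::malloc and the
// object is built in it with placement new, which performs no allocation.
// Conversion to a pointer-to-base is deliberately not supported in this sketch.
template<typename T, typename... Args>
malloc_unique_ptr<T> make_malloc_unique(Args&&... args) {
  static_assert(alignof(T) <= alignof(std::max_align_t),
                "over-aligned types are not handled in this sketch");
  void* const storage = std::malloc(sizeof(T));
  if (storage == nullptr) {
    throw std::bad_alloc();
  }
  try {
    return malloc_unique_ptr<T>(::new (storage) T(std::forward<Args>(args)...));
  } catch (...) {
    // The constructor threw: give the raw storage back before propagating.
    std::free(storage);
    throw;
  }
}

// Usage: the pointer behaves like an ordinary std::unique_ptr.
inline void Example() {
  auto p = make_malloc_unique<std::pair<int, double>>(1, 2.5);
  assert(p->first == 1);
  assert(p->second == 2.5);
}

}  // namespace sketch
```

A drawback of this design is that the deleter becomes part of the pointer type, so every declaration that names the `unique_ptr` type has to change along with its creation site.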
TLDR: Unity seems to have a custom allocator that uses mutexes. Mutexes are slow on macOS. Consequently, on macOS the vast majority of Principia's time seems to be spent on memory management.
After getting back into KSP after a hiatus, I found the game to have poor performance. Running a trace led me to discover a bug in `Vessel::RepeatedlyFlowPrognostication`, but even after fixing it (#2898), the poor performance persisted. Sometimes the game would even pause completely for several seconds. Further traces revealed that the problem was due to four threads concurrently evaluating `OrbitAnalyser::AnalyseOrbit`. Profiling an `AnalyseOrbit` benchmark revealed nothing amiss.

However, I was able to get a flame graph (attached) of a running game which revealed the problem:

![Screen Shot 2021-02-23 at 6 05 12 PM](https://user-images.githubusercontent.com/71856888/108934408-b8a81480-7601-11eb-922b-ba45d3bbe7ae.png)

The regions highlighted in magenta are mutexes used by functions in `UnityPlayer.dylib`. Based on the placement of the `UnityPlayer.dylib` functions in the call stack (and the fact that one of them is `operator new`), I suspect they are a custom allocator used by Unity. Unfortunately, the performance of the stock mutex on macOS is known to be bad (#1955). Thus, terrible performance. The reason this was not apparent in benchmarks is probably that the benchmarks are not run from Unity and hence use the system allocator. This isn't restricted to `AnalyseOrbit` either; the rest of Principia was also affected. I estimate that over 80% of Principia's CPU cost on my machine is spent managing these mutexes!

Something similar was diagnosed and fixed in #1955. Unfortunately, this time the slow mutexes are not Principia's but Unity's. Replacing the misbehaving mutexes with `absl::Mutex` is not really an option here.

I will try to fix this by forcing Principia to use the system allocator even when run from Unity. Stay tuned.
Possibly related to #2247.