"Flight Plan UI" associated steady memory increase (~1GB per 20 minutes) on some machines leading to KSP instability/unusual texture lighting/slow crash/system out of memory #3064
Comments
Unfortunately I wasn't able to reproduce the memory growth that you are observing starting from your save. So it would really help to get a journal that was taken while the memory was growing. Note that I am not completely sure if I'll be able to replay a journal constructed with Gröbner, but it's worth trying. |
OK, below is the link to a .zip with the logs & the very large journal I mentioned for the second run above (it compressed to about 1GB from around 9GB): https://drive.google.com/file/d/1XVf2WVPO3GeLJRM8szzRs003P5o01C1z/view?usp=sharing A significant factor with this journal, if readable, is that towards the end it covers the period where I both delete the original flight plan in the save & click the "0" button for history, after which point the working set memory, within a reasonable amount of time, returns to ~2.6GB for the remainder of the session. If you need a shorter journal, please let me know. I will also build a new KSP 1.11.2 & GameData & attempt to replicate on one of the Dell laptops instead of the ASRock desktop to try to rule out some corruption in the Squad files. As your ideas progress please let me know what you think might be helpful for me to try. For example, assuming I can replicate the issue on a Dell laptop, & since what I have been observing feels like a "confused KSP while flight planning", if you ever assess that there might be something in the save structure that could be confusing KSP...I could make a specific type of flight plan save with a specific earlier version of Principia and then conduct a similar test like the one above, loading that save with Gröbner or some other version of KSP, etc. Thank you. |
If you are starting a new replication effort, I would recommend to use Grossmann, that would make my life easier for debugging. |
What would you consider to be an unexpected amount of working set memory increase on a machine with 32GB total RAM for such a basic scenario as this type of save? I ask because, needing to demonstrate, say, something like only a 3 GB increase would make repeating the tests much more rapid than waiting for lag, texture changes, etc. I was not actually clear whether or not the working set memory increase I observed was considered to be normal & expected. On both machines, I built new KSP 1.11.2 folders on different drives from new downloads. I used the MunTransfer.sfs provided in the OP. I used Gröbner again to check behavior.
I have never seen these specific types of freeze, hang, or "crash on ESC-->load save" behaviors on these machines in many years of working with KSP & Principia prior to this recent issue under investigation. On the ASRock, I rebooted & created a new save with Grossmann. Working set memory followed the same steady increase pattern & the craft textures lost illumination as in the prior tests with Gröbner. The KSP UI "froze" when working set memory reached 10GB. Here is a link to a short screen capture video of the 'KSP frozen UI state' (with Process Explorer info) using Grossmann: I am working on a 'less long' journal using Grossmann since a journal gets large rather quickly...link will be pasted in a new comment. If this shorter one is not enough, please let me know what & for how long exactly you would like the journal to capture...examples of some specifics:
|
First, a clarification: I am not really trying to debug the "disappearing textures", as there is no way that I can figure out what's happening in the KSP/Unity/DirectX/GPU stack. Instead, following the celebrated streetlight effect, I am trying to debug the memory growth, which seems abnormal and may (or may not) be related to the disappearing textures. On my machine, KSP+Principia starts with a private working set of 2.5 GiB. Considering that your game doesn't have long trajectories, I would say that things are getting suspicious when the working set gets to 4-5 GiB. Where it gets confusing is that when I replay your 9 GiB journal (and you must understand that replaying the journal doesn't involve KSP at all, it's just exercising the internal mechanics of Principia) the memory peaks at 75 MiB. That's a far cry from a multi-GiB leak. I was still able to identify two leaks, but they are really small. One is 250 B at 50 Hz, the other 4 B at 50 Hz. This will take a day to reach 1 GiB. I have also found places where we are not smart in allocating/deallocating memory. I'll fix all these issues, but this doesn't come close to explaining the memory growth. I am not exactly sure what to recommend here. Maybe a journal generated with Grossmann that exhibits a sizeable memory growth (2-3 GiB) and then a drop when the flight plan goes away would help, but I am grasping at straws. I'll keep looking and I'll update this issue if I find anything interesting. |
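For anyone wanting to check the "a day to reach 1 GiB" estimate, here is a minimal back-of-the-envelope sketch. It only uses the two leak sizes and the 50 Hz rate quoted in the comment above; the function name is made up for illustration.

```python
GIB = 1024 ** 3  # bytes in one GiB


def seconds_to_leak(target_bytes, leaks):
    """Return the seconds needed for the combined leaks to reach target_bytes.

    leaks is an iterable of (bytes_per_event, events_per_second) pairs.
    """
    rate = sum(size * hz for size, hz in leaks)  # combined bytes per second
    return target_bytes / rate


# Figures quoted in the comment: 250 B at 50 Hz and 4 B at 50 Hz.
leaks = [(250, 50), (4, 50)]
hours = seconds_to_leak(GIB, leaks) / 3600
print(f"{hours:.1f} hours to leak 1 GiB")
```

That is on the order of 70 times slower than the roughly 1 GB per 20 minutes in the issue title, which is why these two leaks cannot account for the observed growth.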
Thanks for the clarification. Hopefully the memory trail is not a bad "red herring" like the "streetlight effect"...at least in the environment on my machine, with this same save, KSP by itself stays around 2GB working set. Since we are both somewhat grasping at straws, here at the link below is my initial 'less clean' Grossmann journal...in other iterations I have observed that the memory increases even if I do nothing but keep the flight plan & stay in the flight scene...in this iteration I do a partial burn & mostly just let it sit as the memory climbed above 5GB before deleting the flight plan, at which point memory growth stopped. https://drive.google.com/file/d/1OQNIf3eEMFhlx4fDo4phkq9G2H6Rqke6/view?usp=sharing I assume the journals would reveal if something was up with calls to the Visual C libraries, but considering the wider environment, I could try reinstalling the Visual C dependency if you think that might rule out anything...my logs report 192930037...I saw recently your mention of "pipelines are running with MSVC version 192930038"? As a side note: from running so many iterations now, my observation-based assessment is that any lighting/texture issue is quite variable in manifestation and thus is likely more just a sign of instability beginning to occur somewhere in the KSP/Unity & associated threads... I'll look forward to your update on whether the Grossmann journal reveals anything different...thanks! Until then I guess my next step is that I'm going to need to do the work to observe what memory does with earlier versions of Principia... |
one other thought: while 'memory' might be somewhat of a red herring regarding Principia itself...might there be a way that Principia's use of memory, under some conditions, could eventually result in something related to what "Lisias" describes regarding Unity/KSP's "garbage collection" in this forum post: |
Update:
Here is a shorter journal with Grossmann relevant to the KSP memory management mitigation concept described below...I let memory increase for a few minutes then delete the flight plan, set history to 0, & switch to tracking station & then to space center at which point KSP working set memory returns to the more usual range (~2.7GB in this case): https://drive.google.com/file/d/14jeWNdU7VwYYaJQuRqYMg7vZmpgOmMo0/view?usp=sharing At least now there appears to be a fairly clear draft mitigation procedure for anyone who experiences a large memory increase:
|
I have replayed the journals, and sadly I don't see any anomalous memory consumption: the journal dated 2021-07-18 peaks at 53 MiB, the one dated 2021-07-19 at 20 MiB. I installed the latest Microsoft Visual C++ Redistributable for Visual Studio 2015, 2017 and 2019 and didn't see any difference. As mentioned previously, journals only exercise part of the Principia code base. In particular they don't run with Unity, so oddities due to Unity are not reproduced. This has been a problem in the past, see #2899. However, I was also unable to see any abnormal heap growth in the real game. My gut feeling (which I cannot support with facts) is that something changed in the environment of some users that is causing trouble. And Gröbner is just guilty by association because it got released at the same time as that mysterious change (with monthly releases, time coincidences are rather probable). Note that while we have had multiple reports of the problem, it's definitely not the case that all (or a majority) of users are affected. Let's try to eliminate some parts of the software stack:
So I am looking suspiciously at the only remaining part of the stack, the GPU driver and libraries. It's not like something fishy has never happened before in that area, see #2297. At this point I am going to put this issue on the back burner until new information surfaces. I don't think it's productive for you or for me to spend more time on it while I am unable to reproduce the problem. If you want to make one last experiment, try to see if the issue reproduces with an older version, like Grassmann or Green. If it does, it will pretty much exonerate Principia. If it doesn't, well, we'll still be clueless. |
back burner sounds good...one insight to help this simmer into a decent broth stock: at least on the one machine I am observing in detail (=caveat), mostly based on watching "KSP" memory allocations with VMMapx64 & one windows 10 debugger trace, there may be some clarity as to why "reports" are more frequent 'recently' for users prone to "this underlying condition" hypothesis: for quite some time (months? years?), some condition results in the "KSP/Unity/Principia Flight Scene Flight Plan/"Windows 10-DX11?-Nvidia? interface" complex to steadily increase some part of the "heap" portion of the memory allocation... This 'heap growth' appears to directly correlate with the "KSP" instability (at least that I observe) of "this issue". "With Principia Grassmann", the "win10/Unity/KSP/Principia" complex allocates a much smaller initial heap relative to Grossmann plus allocating a much greater % of the memory as "private data" relative to Grossmann. Further, in both heap & private data, the allocations are in a significantly higher % of "shared" forms of memory "with Grassmann" relative to "with Grossmann". From my in game observations, this "smaller initial heap, etc." of the complex "with Grassmann" indeed appears to delay substantially the in game signs of "trouble" that appear when "the heap grows unusually large". Consequently, I at least never noticed it before...since I would have completed the flight plan & changed scenes well before "symptoms" became easy to spot. The changes in memory allocation for the complex "with Grossmann" appear to just make the underlying condition much sooner/easier to observe in game... I remain clueless as to the 'why' regarding the underlying condition of "heap growth in the 'presence' of a Principia flight scene flight plan" that gets released with flight plan deletion & scene change in the "KSP/Unity/Principia Flight Scene Flight Plan/'Windows 10-DX11?-Nvidia? interface' " complex. |
At least for those of us that experience this we can mitigate it even further by making sure we ONLY have the Flight plan UI open on the screen when we are actively making changes to the flight plan & otherwise leave it closed/minimized. I understand this issue is at the bottom of the barrel, but I wanted practice with the win10 debugger tools anyway so here are some interesting updates: Very odd that only some users experience this...The memory behavior is very reliable on the 3 machines I have been using to evaluate it...they have different boards, memory, BIOS, chipsets, and GPUs; however, all are Ivy Bridge series processors (sadly I do not have access to later series chips where I can run KSP). It occurs both with KSP d3dx11 as well as if KSP is forced to use OpenGL with the -force-glcore flag. It also occurs both with the Nvidia cards as well as with the Intel iGPU. This enabled me to narrow down that this symptom, which at least a few of us users are experiencing, is directly related to the flight plan UI being open with at least a flight plan created: i.e. memory does not increase when the flight plan UI is open alone without a created flight plan, nor does it increase if the flight plan UI is minimized/closed even when containing a created flight plan. Reliable control:
speculation: From the memory analyses linked below, it has the feel that memory is being repeatedly & steadily called for something like UI "textures/shaders" since the memory grows only when the UI is 'open on the screen' but rather than this "texture/shader memory" being freed when the UI is 'minimized/closed' by the 'flight plan' button in the main Principia window it is held all the way until the entire flight plan is deleted...(and, for whatever reason Windows does not fully release it until changing scene in KSP). Using the Intel iGPU, I did a sequence of leak tracking dumps on the same KSP process (1.12.2 in this case but pattern is similar in KSP 1.10.1 & 1.8.1 & similar using either d3d11 or OpenGL on the Nvidia RTX2070): album showing the 1st page of each memory analysis report: d3d11!NOutermost::CDevice::CreateLayeredChild+e5 keeps increasing the most, (again, a similar growth is reported if using OpenGL but with a different .dll of course).
The memory analysis report shows the interesting sequence of related calls (opens in Internet Explorer since the MS debugdiag .mht uses active content): I also looked at the dates of various clusters of changes in the change history for the Principia "flight_planner.cs" and tried some older Principia versions before various changes:
|
FWIW I'm experiencing the same problem on Ubuntu 20.10 x64. Seems to be independent of KSP or Principia versions, but I didn't get into any of the gory details provided by Growflavor. I'll try to dig up the goods on my GPU, etc. |
My hardware info: |
@larryfast
@pleroy I see that you are working on other, much more important issues & features. However, at some point convenient to your workflow, I would be grateful for the following .PML from a machine where the memory growth symptom is not present, ideally taken while launching KSP 1.12.2 with default settings.
(I really hope it is something I can fix locally like that...for adults whose machines exhibit this approx. 1GB per 20 minute increase in private working set KSP memory use, mitigating it may be less of an issue, but reminding young people to open & close windows or losing their work when KSP becomes unstable is a real distraction for me.) |
I'm running KSP on Ubuntu. The process described seems to be windows based. Happy to help if it's something I can do on Linux. Sadly I have to leave this Snark hunt to others. |
@pleroy there is at least one odd thing, a series of requests that contains a "\GameData\Principia\GameData\Principia\x64" path duplication, that I see in the PMLs for KSP run on my main machine. I also looked at it in realtime in Process Monitor today with a fresh KSP setup. I mention it because a slew of these occur when the Flight Plan UI is opened the 1st time after resuming a saved game. Here are two examples of the stack Process Monitor reports for two of the calls when the Flight Plan UI was first opened: ProcessMonitorFastIODuplicatePrincipiaPathOpenFlightPlanUIStackExample.zip Unfortunately, there appear to be other times (perhaps when other UIs are opened) that this double path occurs as well, not just with the Flight Plan UI...however there are a lot with the Flight Plan UI. So I decided it is worth mentioning (hundreds of instances throughout the latter part of the above PML I already shared, for example) in case it sparks any fruitful ideas:
edit: just to be clear,
I am willing to do a comparison with a PML for KSP run from your machine...(the AstroGrep & WinMerge tools work well when I convert the PML to CSV in Process Monitor). Sadly, the only other thing I see so far in real time with Process Monitor is the thread create & exit Orbit analysis activity just when the main Principia UI is open; however, that seems to be normal/expected & does not appear to cause any memory allocation growth. I'll look forward to your further insights when you have time. Thank you! |
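As an alternative to grepping by hand, the duplicated-path check on a PML converted to CSV could be sketched roughly as below. This is only an illustration: the column names "Path" and "Operation" are assumed to match what Process Monitor writes in its CSV export, the function name is made up, and the sample rows are invented stand-ins, not data from the shared PML.

```python
import csv
import io
from collections import Counter

# Duplicated segment reported in the comment above.
DUPLICATED = r"\GameData\Principia\GameData\Principia"


def count_duplicated_paths(csv_file, path_column="Path", key_column="Operation"):
    """Count rows whose path contains the duplicated segment, grouped by key_column."""
    counts = Counter()
    for row in csv.DictReader(csv_file):
        path = row.get(path_column) or ""
        # Windows paths are case-insensitive, so compare in lower case.
        if DUPLICATED.lower() in path.lower():
            counts[row.get(key_column) or "?"] += 1
    return counts


# Invented sample standing in for a real export.
sample_lines = [
    r'"Operation","Path","Result"',
    r'"CreateFile","C:\KSP\GameData\Principia\GameData\Principia\x64\a.dll","PATH NOT FOUND"',
    r'"CreateFile","C:\KSP\GameData\Principia\GameData\Principia\x64\b.dll","PATH NOT FOUND"',
    r'"QueryOpen","C:\KSP\GameData\Principia\GameData\Principia\x64\a.dll","PATH NOT FOUND"',
    r'"CreateFile","C:\KSP\GameData\Principia\x64\a.dll","SUCCESS"',
]
counts = count_duplicated_paths(io.StringIO("\n".join(sample_lines) + "\n"))
for operation, n in counts.most_common():
    print(operation, n)
```

For a real export, the same function can be given a file opened with `newline=""`; an encoding such as `utf-8-sig` may be needed if the file starts with a byte-order mark. Grouping by operation (or by time of day) would make it easier to see whether the duplicates cluster around opening the Flight Plan UI.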
I have seen similar (weird) paths when debugging unrelated issues. I think this is just that KSP sets a search path that looks in multiple places, and thus ends up constructing these nested paths. Nothing to see here, I guess. |
So I looked again at this problem for the first time since July, and, interestingly, I was able to reproduce it using the latest release, Halley (no idea why I couldn't the first time around). The heap debugger was not very informative because all the memory managed by Principia looked fine, but there was a suspicious increase in memory related to Unity. So I went for a different approach, and started to comment out bits and pieces of code in the flight planner UI, running the game for 5 minutes from the [...] After a rather long and protracted process, I was able to narrow down the problem to the display of horizontal lines in the flight plan UI. It appears that, when displaying a horizontal line for a single frame, Unity irreversibly burns about 9 kiB of heap. Since the problem is in Unity, I can easily imagine that it could lead to corruption of various parts of the UI (e.g., the "Cheshire Cat" navball). I also presume that other windows where we display horizontal lines (e.g., the orbit analyzer) would lead to similar issues. A slight change to the way that we display these horizontal lines seems sufficient to prevent Unity stupidity. Meanwhile, the Internet being an inexhaustible source of wisdom, there is a meme for that. |
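As a rough sanity check that a leak of about 9 kiB per horizontal line per frame is the right order of magnitude, here is a small sketch. It assumes the "~1GB per 20 minutes" from the issue title can be treated as 1 GiB and that it describes the same leak; the function name is made up for illustration.

```python
KIB = 1024
GIB = 1024 ** 3


def line_draws_per_second(growth_bytes, seconds, bytes_per_line_draw):
    """Horizontal-line draws per second needed to explain the observed growth."""
    return growth_bytes / seconds / bytes_per_line_draw


# Observed growth from the issue title and the per-line cost quoted above.
needed = line_draws_per_second(1 * GIB, 20 * 60, 9 * KIB)
print(f"about {needed:.0f} line draws per second")
```

At a frame rate of a few tens of frames per second (not stated in this thread, so an assumption), that corresponds to only a handful of horizontal lines drawn per frame, which seems plausible for a flight plan window.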
Several users on Discord & the KSP forum have recently observed an unusual cluster of symptoms...this issue is to help identify the interaction/cause mix of Unity, KSP, user hardware, normal expected Principia memory use, or bugs in any of these.
Apparent observations reported so far:
Using just KSP 1.11.2 with Principia Gröbner & two test saves provided below, I have observed working set memory to increase from an order of 2GB to an order of 8GB (on a machine with 32GB total RAM) at which point KSP hung during a slow time warp while the orbit analyzer was open for a simple vessel in Mun orbit.
After the 1st KSP hang/crash, I then reloaded KSP & repeated the flight with journaling on & observed the same working set memory increase up to close to 8GB and then decided to delete the flight plan & set history to 0 & to build a new flight plan.
test save used for the 1st replication & video:
PrincipiaMun60kFreeFlyBy.zip
test save (created during the 1st video WITHOUT journaling) but used with journaling ON for 2nd replication & video:
MunTransfer.zip