Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Stream Simulation Resources #1507

Merged
merged 20 commits into from
Jul 24, 2024
Merged

Stream Simulation Resources #1507

merged 20 commits into from
Jul 24, 2024

Conversation

Mythicaeda
Copy link
Contributor

@Mythicaeda Mythicaeda commented Jul 18, 2024

Description

Should not be breaking.

Extracts resource management out of the engine and into a new class called the SimulationResourceManager the InMemorySimulationResourceManager operates nearly the same way as on develop, except that the serialized form of the profile segment is kept around, rather than the pre-serialized form. The Simulate action uses the StreamingSimulationResourceManager, which streams profile segments to the DB in batches.

Additionally, the engine has been slightly refactored:

  • the step and init logic which was repeated verbatim in all simulate methods has been pulled into methods in the engine
  • the engine now tracks its own timeline, cells, and elapsedTime
    • related, the logic of tracking mutliple timelines for checkpoint simulation has been simplified into the engine containing a referenceTimeline, which is defined as the referenceTimeline combined with the timeline of the engine it was copied from
  • computeResults and related methods have been made non-static, as they require an instance of a simulationEngine to execute and now rely on internal fields of the engine.

Additionally, since all but the SynchronousSimulationAgent are unused, the interface SimulationAgent has been flattened.

Verification

Manually verified that streaming is working.

Did some profiling on streaming vs develop. I used Clipper’s gnc-all-activities plan and ran profiling twice, once with high-overhead memory instrumentation and once with async sampling. Findings:

  • Streaming has a positive impact on overall Heap memory usage. From what I can tell, the heap usage is always <= develop, with the difference being ~0.1GB, and given that the max heap usage is ~1GB, that’s up to a 10% save (tho its closer to a 0% save while streaming). I’d definitely prefer to do this comparison again against a plan that consumes more memory, though.
    • I took a look at the breakdowns and, from what I can tell, in Streaming most Heap memory was in the young “Eden” scape, while on develop most Heap memory was in the older “Old Gen/Survivor” scape (preference seemed to depend on whether I was profiling memory)
  • For non-heap memory, streaming appeared to have a neglible to slight negative impact--although we’re talking within a couple MBs. Of note is that Develop has an increase of non-heap memory at the end, while Streaming is consistent all the way through--at least when doing memory profiling. When not profiling, while Streaming starts with a higher non-heap memory, at about halfway through the usage is just under halved, while develop’s usage remains approximately constant. I don’t know what the subcategories for non-heap memory mean, so I’ll skip an analysis.
  • Streaming took slightly longer than develop - 27min vs 25min with profiling on, 2min v 1min with profiling off. This may be avoidable by having streaming work in its own thread.

I also ran a JDBC profiler alongside the memory profiler and of note:

  • the threshold for streaming may need to be tweaked, as Streaming spent 50 seconds total awaiting Network I/O calls when inserting profile segments (reminder that this simulation took 27 minutes total), although the average individual wait was 150 microseconds. Similarly, it spent 18 seconds total on the segments being actively written, with the average execution of 50 microseconds
    • Taking a quick look at the code, I accidentally enabled the “returned generated columns” flag on the “insert segments” statement, which may be the reason the Network I/O was so high for that statement, since that flags asks the JDBC to request excess data back from the DB
  • We may want to take a quick peek at simulation_extent, as an abnormally long amount of time was spent updating this table--8 seconds overall on develop, 3 seconds on streaming, with the average execution time for the statement being 11ms on develop, 3ms on streaming.

Raw Snapshots for those curious: Streaming.zip

Documentation

No docs need to be updated.

Future work

Investigate simulation_extent

@Mythicaeda Mythicaeda added refactor A code change that neither fixes a bug nor adds a feature feature A new feature or feature request simulation Anything related to the simulation domain labels Jul 18, 2024
@Mythicaeda Mythicaeda self-assigned this Jul 18, 2024
@Mythicaeda Mythicaeda requested a review from a team as a code owner July 18, 2024 17:33
@Mythicaeda Mythicaeda force-pushed the feat/stream-sim-results branch from 3217388 to fdecca9 Compare July 18, 2024 22:25
@Mythicaeda
Copy link
Contributor Author

Just pushed changes addressing the set of comments left during the walkthrough.

Copy link
Contributor

@adrienmaillard adrienmaillard left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No big concern, just a few comments.

Streaming resources requires a third connection to the DB
- TimeTrackerDaemon: make counter "final"
- CheckpointSimFacadeTest: Fix incorrect assertEquals
- InMemoryCachedEngineTest: Use ResourceManager, reflect changes to `SimulationEngine` constructor
SimulationDuplicationTest: Use ResourceManager
- FooSimulationDuplicationTest: Remove empty consumer
Copy link
Contributor

@adrienmaillard adrienmaillard left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work \o/

@Mythicaeda Mythicaeda merged commit 8d29fd9 into develop Jul 24, 2024
6 checks passed
@Mythicaeda Mythicaeda deleted the feat/stream-sim-results branch July 24, 2024 20:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature A new feature or feature request refactor A code change that neither fixes a bug nor adds a feature simulation Anything related to the simulation domain
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Stream simulation results into the database
3 participants