-
Notifications
You must be signed in to change notification settings - Fork 615
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Optimize Record._elements to not duplicate VectorMap if possible #4254
Conversation
val originalElements = elements | ||
// Don't create a new map unless necessary, this is much more memory efficient in common case | ||
// This is true if elements is not a VectorMap or if any names need sanitization | ||
var needNewMap = !originalElements.isInstanceOf[VectorMap[_, _]] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is there a reason why this has to be done with a var
and also why we aren't just doing it something like:
def makeNewMap(newNames: Seq[String) = newNames.view
.zip(originalElements)
.map { case (name, (_, data)) => name -> data }
.to(VectorMap) // VectorMap has O(1) lookup whereas ListMap is O(n)
val newElements = originalElements match {
case original: VectorMap[_, _] => {
val newNames = ...
val allNamesUnchanged = newNames == origNames
if (allNamesUnchanged) original else makeNewMap(newNames)
case _ => makeNewMap(newNames)
}
...? I guess you are trying to only iterate over it once ...?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess you are trying to only iterate over it once ...?
Correct, I'm trying to minimize the total number of traversals
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, the changes make sense
(cherry picked from commit 0df3ec9)
…) (#4265) (cherry picked from commit 0df3ec9) Co-authored-by: Jack Koenig <[email protected]> Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
The code is a little janky, but it is a massive reduction in memory use.
For a typical Bundle with 7 UInt fields, this reduces the memory footprint from 1992 bytes to 1552 bytes (the duplicate VectorMap of only 7 elements is 440 bytes!), a 22% reduction. Combined with my other recent performance improvement PRs (#4251, #4252, #4253 but excluding #4242), this 1992 bytes is reduced to 1424 for a total reduction of 28.5% memory use for this typical Bundle with 7 UInt fields.
I suspect we can do better by getting rid of any Map here at all and instead just use an
Array[Data]
(where the String names of the fields are stored in the children Data themselves as they are already stored there). Whereas the VectorMap of 7 elements is 440 bytes, an Array of 7 elements is 48 bytes.That being said, this is a much simpler change that imparts a huge benefit so is worth it in the meantime.
Contributor Checklist
docs/src
?Type of Improvement
Desired Merge Strategy
Release Notes
This reduces memory use of a typical bundle by 20%.
Reviewer Checklist (only modified by reviewer)
3.6.x
,5.x
, or6.x
depending on impact, API modification or big change:7.0
)?Enable auto-merge (squash)
, clean up the commit message, and label withPlease Merge
.Create a merge commit
.