-
Notifications
You must be signed in to change notification settings - Fork 615
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement compressed Namespace #2856
Merged
Merged
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
65848bf
to
fb716f8
Compare
Nice space reduction! |
jackkoenig
commented
Nov 20, 2022
jackkoenig
commented
Nov 20, 2022
775a5ff
to
36b81f7
Compare
The namespace disambiguates requests for the same name with _<idx>. Rather than storing every disambiguated name in the underlying HashMap, it now only stores the base along with the "next available" index. This makes the logic for checking if a name is already contained in the namespace slightly more sophisticated because users can name things in a way that will collide with disambiguated names from a common substring. For example, in naming the sequence "foo", "foo", "foo_1", the 2nd "foo" takes the name "foo_1" so the following "foo_1" gets disambiguated to "foo_1_1". But since we compressed that original "foo_1" into the same HashMap entry as just "foo", we have to do a form of "prefix checking" whenever naming something that ends in "_<idx>". In practice, the saved memory allocations more than make up for the more complicated logic to disambiguate names because the common case is still fast.
36b81f7
to
da5cda1
Compare
I finally got some measurements and this decreases peak memory use for a fairly large design by about ~3%. Not huge, but still nice. |
azidar
approved these changes
Nov 29, 2022
mergify bot
pushed a commit
that referenced
this pull request
Nov 29, 2022
The namespace disambiguates requests for the same name with _<idx>. Rather than storing every disambiguated name in the underlying HashMap, it now only stores the base along with the "next available" index. This makes the logic for checking if a name is already contained in the namespace slightly more sophisticated because users can name things in a way that will collide with disambiguated names from a common substring. For example, in naming the sequence "foo", "foo", "foo_1", the 2nd "foo" takes the name "foo_1" so the following "foo_1" gets disambiguated to "foo_1_1". But since we compressed that original "foo_1" into the same HashMap entry as just "foo", we have to do a form of "prefix checking" whenever naming something that ends in "_<idx>". In practice, the saved memory allocations more than make up for the more complicated logic to disambiguate names because the common case is still fast. (cherry picked from commit 1654d87) # Conflicts: # core/src/main/scala/chisel3/internal/Builder.scala
mergify bot
added a commit
that referenced
this pull request
Nov 29, 2022
* Implement compressed Namespace (#2856) The namespace disambiguates requests for the same name with _<idx>. Rather than storing every disambiguated name in the underlying HashMap, it now only stores the base along with the "next available" index. This makes the logic for checking if a name is already contained in the namespace slightly more sophisticated because users can name things in a way that will collide with disambiguated names from a common substring. For example, in naming the sequence "foo", "foo", "foo_1", the 2nd "foo" takes the name "foo_1" so the following "foo_1" gets disambiguated to "foo_1_1". But since we compressed that original "foo_1" into the same HashMap entry as just "foo", we have to do a form of "prefix checking" whenever naming something that ends in "_<idx>". In practice, the saved memory allocations more than make up for the more complicated logic to disambiguate names because the common case is still fast. (cherry picked from commit 1654d87) # Conflicts: # core/src/main/scala/chisel3/internal/Builder.scala * Resolve backport conflicts Co-authored-by: Jack Koenig <[email protected]>
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
The namespace disambiguates requests for the same name with _. Rather than storing every disambiguated name in the underlying HashMap, it now only stores the base along with the "next available" index. This makes the logic for checking if a name is already contained in the namespace slightly more sophisticated because users can name things in a way that will collide with disambiguated names from a common substring.
For example, in naming the sequence "foo", "foo", "foo_1", the 2nd "foo" takes the name "foo_1" so the following "foo_1" gets disambiguated to "foo_1_1". But since we compressed that original "foo_1" into the same HashMap entry as just "foo", we have to do a form of "prefix checking" whenever naming something that ends in
_<idx>
.In practice, the saved memory allocations more than make up for the more complicated logic to disambiguate names because the common case is still fast.
To benchmark this, I took a large internal design a picked a module that had a large Namespace. I then dumped every requested name to a file for that module and used that set of 243126 names to measure how much memory the version on master uses vs. this version.
I found that this change reduces the amount of time to name those 243126 names from 66ms to 34ms (about 2x) and reduces the memory used from 80 MiB to a mere 1.2 MiB (a staggering 66x reduction). Of course these results depend on how many name collisions there are, and the fewer there are the smaller the savings, but I believe this module to be fairly representative.
Namespace operations are not a huge part of overall Chisel elaboration, so it's a bit hard to measure the benefit for full elaboration, but I suspect it reduces memory use and pressure by a few percent, maybe 3-5%. I'm trying to get some good measurements here.
Contributor Checklist
docs/src
?Type of Improvement
API Impact
No impact
Backend Code Generation Impact
This does perturb naming of a handful of signals (~200
in a design that has millions). I am not too worried but can spend more time understanding the issue if anyone finds this concerning.Nevermind I figured it out. It was the case of
_0
being a "false collision" because the namespace starts disambiguating at_1
. There's extra logic now to preserve the old behavior for_0
that just results in 1 additional boolean check in an uncommon (but hit at least 200 times!) code path.This has no impact on the generated FIRRTL/Verilog.
Desired Merge Strategy
Release Notes
Reduce the memory use of internal Namespace datastructure.
Reviewer Checklist (only modified by reviewer)
3.4.x
, [small] API extension:3.5.x
, API modification or big change:3.6.0
)?Enable auto-merge (squash)
, clean up the commit message, and label withPlease Merge
.Create a merge commit
.