Implement compressed Namespace #2856

jackkoenig · 2022-11-20T06:20:09Z

The namespace disambiguates requests for the same name with _<idx>. Rather than storing every disambiguated name in the underlying HashMap, it now only stores the base along with the "next available" index. This makes the logic for checking if a name is already contained in the namespace slightly more sophisticated because users can name things in a way that will collide with disambiguated names from a common substring.

For example, in naming the sequence "foo", "foo", "foo_1", the 2nd "foo" takes the name "foo_1" so the following "foo_1" gets disambiguated to "foo_1_1". But since we compressed that original "foo_1" into the same HashMap entry as just "foo", we have to do a form of "prefix checking" whenever naming something that ends in _<idx>.

In practice, the saved memory allocations more than make up for the more complicated logic to disambiguate names because the common case is still fast.
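
The scheme described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the class name `CompressedNamespace` and the helpers `splitSuffix` and `entryCount` are hypothetical, and the real `Namespace` in `Builder.scala` may differ in details.

```scala
import scala.collection.mutable
import scala.util.Try

// Hypothetical sketch; not the actual chisel3.internal.Namespace implementation.
final class CompressedNamespace {
  // Maps a name to the next index to try when that same name is requested again.
  // Generated names such as "foo_1" are NOT stored; they are implied by the
  // entry for "foo" having a next index greater than 1.
  private val nextIndex = mutable.HashMap.empty[String, Int]

  // Splits "foo_12" into ("foo", 12). Returns None unless the suffix could have
  // been generated by this class: ASCII digits, no leading zero (so "_0" and
  // "_01" never match, since indices start at 1), and small enough for an Int.
  private def splitSuffix(name: String): Option[(String, Int)] = {
    val i = name.lastIndexOf('_')
    val digits = if (i < 0) "" else name.substring(i + 1)
    val generated =
      digits.nonEmpty && digits.head != '0' && digits.forall(c => c >= '0' && c <= '9')
    if (generated) Try(digits.toInt).toOption.map(idx => (name.substring(0, i), idx))
    else None
  }

  // A name is taken if it was stored directly, or if it has the form
  // "<base>_<idx>" and <base> has already handed out index <idx>.
  def contains(name: String): Boolean =
    nextIndex.contains(name) || splitSuffix(name).exists {
      case (base, idx) => nextIndex.get(base).exists(_ > idx)
    }

  // Returns the requested name if it is free, otherwise "<requested>_<idx>"
  // for the first free index at or after the stored "next available" index.
  def name(requested: String): String =
    if (!contains(requested)) {
      nextIndex(requested) = 1
      requested
    } else {
      var idx = nextIndex.getOrElse(requested, 1)
      while (contains(s"${requested}_$idx")) idx += 1
      nextIndex(requested) = idx + 1
      s"${requested}_$idx"
    }

  // Number of HashMap entries actually stored.
  def entryCount: Int = nextIndex.size
}
```

With this sketch, requesting "foo", "foo", "foo_1" yields "foo", "foo_1", "foo_1_1" while storing only two map entries, and the common case (a name with no numeric suffix) costs a single map lookup.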

To benchmark this, I took a large internal design and picked a module that had a large Namespace. I then dumped every requested name to a file for that module and used that set of 243126 names to measure how much memory the version on master uses vs. this version.

I found that this change reduces the amount of time to name those 243126 names from 66ms to 34ms (about 2x) and reduces the memory used from 80 MiB to a mere 1.2 MiB (a staggering 66x reduction). Of course these results depend on how many name collisions there are, and the fewer there are the smaller the savings, but I believe this module to be fairly representative.
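
For anyone wanting to reproduce this kind of measurement, a replay harness might look roughly like the sketch below. It is an assumption about how such a benchmark could be structured, not the harness used for the numbers above: the object name `NamespaceReplay` and the `measure` helper are hypothetical, and heap deltas taken around `System.gc()` are only approximate.

```scala
// Hypothetical replay harness; not the benchmark used for the numbers in this PR.
object NamespaceReplay {
  /** Feeds every name to `assign` and returns
    * (elapsed milliseconds, approximate growth in used heap in bytes).
    */
  def measure(names: Iterator[String])(assign: String => String): (Double, Long) = {
    val rt = Runtime.getRuntime
    System.gc() // best-effort request; the JVM may ignore it
    val usedBefore = rt.totalMemory() - rt.freeMemory()
    val start = System.nanoTime()
    names.foreach(assign)
    val elapsedMs = (System.nanoTime() - start) / 1e6
    System.gc()
    // The namespace captured by `assign` is still reachable here,
    // so its retained memory is included in the delta.
    val usedAfter = rt.totalMemory() - rt.freeMemory()
    (elapsedMs, usedAfter - usedBefore)
  }
}
```

A run would pass the dumped names (for example, the lines of the dump file read with `scala.io.Source`) together with the naming function of the namespace under test, once for master and once for this branch.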

Namespace operations are not a huge part of overall Chisel elaboration, so it's a bit hard to measure the benefit for full elaboration, but I suspect it reduces memory use and pressure by a few percent, maybe 3-5%. I'm trying to get some good measurements here.

Contributor Checklist

Did you add Scaladoc to every public function/method?
Did you add at least one test demonstrating the PR?
Did you delete any extraneous printlns/debugging code?
Did you specify the type of improvement?
Did you add appropriate documentation in docs/src?
Did you state the API impact?
Did you specify the code generation impact?
Did you request a desired merge strategy?
Did you add text to be included in the Release Notes for this change?

Type of Improvement

performance improvement

API Impact

No impact

Backend Code Generation Impact

~~This does perturb naming of a handful of signals (~200 in a design that has millions). I am not too worried but can spend more time understanding the issue if anyone finds this concerning.~~

Never mind, I figured it out. It was the case of _0 being a "false collision" because the namespace starts disambiguating at _1. There's extra logic now to preserve the old behavior for _0 that just results in 1 additional boolean check in an uncommon (but hit at least 200 times!) code path.

This has no impact on the generated FIRRTL/Verilog.

Desired Merge Strategy

Squash

Release Notes

Reduce the memory use of the internal Namespace data structure.

Reviewer Checklist (only modified by reviewer)

Did you add the appropriate labels?
Did you mark the proper milestone (Bug fix: 3.4.x, [small] API extension: 3.5.x, API modification or big change: 3.6.0)?
Did you review?
Did you check whether all relevant Contributor checkboxes have been checked?
Did you do one of the following when ready to merge:
- Squash: You or the contributor enable auto-merge (squash), clean up the commit message, and label with Please Merge.
- Merge: Ensure that contributor has cleaned up their commit history, then merge with Create a merge commit.

aswaterman · 2022-11-20T07:09:52Z

Nice space reduction!


jackkoenig · 2022-11-29T02:13:53Z

I finally got some measurements, and this decreases peak memory use for a fairly large design by about 3%. Not huge, but still nice.


* Implement compressed Namespace (#2856)

  The namespace disambiguates requests for the same name with _<idx>. Rather than storing every disambiguated name in the underlying HashMap, it now only stores the base along with the "next available" index. This makes the logic for checking if a name is already contained in the namespace slightly more sophisticated because users can name things in a way that will collide with disambiguated names from a common substring. For example, in naming the sequence "foo", "foo", "foo_1", the 2nd "foo" takes the name "foo_1" so the following "foo_1" gets disambiguated to "foo_1_1". But since we compressed that original "foo_1" into the same HashMap entry as just "foo", we have to do a form of "prefix checking" whenever naming something that ends in "_<idx>". In practice, the saved memory allocations more than make up for the more complicated logic to disambiguate names because the common case is still fast.

  (cherry picked from commit 1654d87)

  # Conflicts:
  #   core/src/main/scala/chisel3/internal/Builder.scala

* Resolve backport conflicts

Co-authored-by: Jack Koenig <[email protected]>

jackkoenig added this to the 3.5.x milestone Nov 20, 2022

jackkoenig force-pushed the compressed-namespace branch from 65848bf to fb716f8 November 20, 2022 06:37

jackkoenig commented Nov 20, 2022

core/src/main/scala/chisel3/internal/Builder.scala (outdated, resolved)

jackkoenig commented Nov 20, 2022

core/src/main/scala/chisel3/internal/Builder.scala (outdated, resolved)

jackkoenig force-pushed the compressed-namespace branch from 775a5ff to 36b81f7 November 29, 2022 02:05

jackkoenig force-pushed the compressed-namespace branch from 36b81f7 to da5cda1 November 29, 2022 02:08

jackkoenig requested a review from azidar November 29, 2022 02:14

azidar approved these changes Nov 29, 2022


jackkoenig merged commit 1654d87 into master Nov 29, 2022

jackkoenig deleted the compressed-namespace branch November 29, 2022 05:41

mergify bot mentioned this pull request Nov 29, 2022

Implement compressed Namespace (backport #2856) #2860

Merged

mergify bot added the Backported label Nov 29, 2022

azidar mentioned this pull request Mar 13, 2023

A name followed by the same name with an trailing underscore results in java.lang.NumberFormatException #3084

Open
