Implement compressed Namespace #2856

Merged
merged 1 commit into from
Nov 29, 2022
jackkoenig
Contributor

@jackkoenig jackkoenig commented Nov 20, 2022

The namespace disambiguates requests for the same name with _<idx>. Rather than storing every disambiguated name in the underlying HashMap, it now only stores the base along with the "next available" index. This makes the logic for checking if a name is already contained in the namespace slightly more sophisticated because users can name things in a way that will collide with disambiguated names from a common substring.

For example, in naming the sequence "foo", "foo", "foo_1", the 2nd "foo" takes the name "foo_1" so the following "foo_1" gets disambiguated to "foo_1_1". But since we compressed that original "foo_1" into the same HashMap entry as just "foo", we have to do a form of "prefix checking" whenever naming something that ends in _<idx>.

In practice, the saved memory allocations more than make up for the more complicated logic to disambiguate names because the common case is still fast.
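
As a rough illustration of the scheme described above (a sketch under assumptions, not the code in this PR), the compressed map and the prefix check could look like the following. The class and method names are hypothetical, name sanitization and keyword handling are left out, and `toLongOption` assumes Scala 2.13 or later.

```scala
import scala.annotation.tailrec
import scala.collection.mutable

// Illustrative sketch only; names are hypothetical and this is not the
// actual chisel3.internal.Namespace source.
final class CompressedNamespace {
  // One entry per requested name: name -> next available "_<idx>" index.
  private val next = mutable.HashMap.empty[String, Long]

  def entries: Int = next.size

  // Index of the '_' that starts a trailing "_<digits>" suffix, or 0 if none.
  private def suffixStart(n: String): Int = {
    var i = n.length - 1
    while (i > 0 && n(i).isDigit) i -= 1
    if (i == n.length - 1 || n(i) != '_') 0 else i
  }

  // Some(index to start disambiguating from) if `n` is already taken, else None.
  private def takenIndex(n: String): Option[Long] =
    next.get(n).orElse {
      val p = suffixStart(n)
      if (p == 0) None
      else
        for {
          idx     <- n.substring(p + 1).toLongOption
          nextIdx <- next.get(n.substring(0, p))
          // "<base>_<idx>" was already handed out only if 1 <= idx < nextIdx.
          if idx != 0 && idx < nextIdx
        } yield 1L
    }

  // Finds the first free "<base>_<idx>" and records the next index for `base`.
  @tailrec
  private def rename(base: String, idx: Long): String = {
    val candidate = s"${base}_$idx"
    if (next.contains(candidate)) rename(base, idx + 1)
    else {
      next(base) = idx + 1
      candidate
    }
  }

  def name(requested: String): String =
    takenIndex(requested) match {
      case Some(idx) => rename(requested, idx)
      case None =>
        next(requested) = 1L
        requested
    }
}

object CompressedNamespaceDemo {
  def main(args: Array[String]): Unit = {
    val ns = new CompressedNamespace
    println(Seq("foo", "foo", "foo_1").map(ns.name)) // List(foo, foo_1, foo_1_1)
    println(ns.entries)                               // 2
  }
}
```

In this sketch the common case (a name with no `_<idx>` suffix) costs one map lookup plus a short backwards scan of the string; the second lookup on the prefix only happens for names that end in `_<idx>`.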

To benchmark this, I took a large internal design and picked a module that had a large Namespace. I then dumped every requested name to a file for that module and used that set of 243126 names to measure how much memory the version on master uses vs. this version.

I found that this change reduces the amount of time to name those 243126 names from 66ms to 34ms (about 2x) and reduces the memory used from 80 MiB to a mere 1.2 MiB (a staggering 66x reduction). Of course these results depend on how many name collisions there are, and the fewer there are the smaller the savings, but I believe this module to be fairly representative.
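
To make the dependence on collisions concrete, here is a small sketch of a namespace that stores every issued name, which is my simplified reading of the previous approach rather than the code on master. The class name and the synthetic request list are made up for illustration. It holds one map entry per request, whereas a map keyed only by base name would hold roughly one entry per distinct requested name.

```scala
import scala.collection.mutable

// Hypothetical sketch of a "store every issued name" namespace; not the
// actual code on master.
final class StoreEveryNameNamespace {
  private val names = mutable.HashMap.empty[String, Long]

  def entries: Int = names.size

  def name(elem: String): String =
    names.get(elem) match {
      case None =>
        names(elem) = 1L // every issued name becomes its own entry
        elem
      case Some(idx) =>
        names(elem) = idx + 1
        name(s"${elem}_$idx") // the disambiguated name is recorded too
    }
}

object EntryCountDemo {
  def main(args: Array[String]): Unit = {
    // Synthetic workload: one heavily repeated name plus one unique name.
    val requests = Seq.fill(1000)("x") :+ "y"
    val ns = new StoreEveryNameNamespace
    requests.foreach(ns.name)
    println(ns.entries)             // 1001: one entry per request
    println(requests.distinct.size) // 2: entries needed when keyed by base
  }
}
```

With no repeated names the two counts are equal, which matches the caveat that fewer collisions mean smaller savings.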

Namespace operations are not a huge part of overall Chisel elaboration, so it's a bit hard to measure the benefit for full elaboration, but I suspect it reduces memory use and pressure by a few percent, maybe 3-5%. I'm trying to get some good measurements here.

Contributor Checklist

  • Did you add Scaladoc to every public function/method?
  • Did you add at least one test demonstrating the PR?
  • Did you delete any extraneous printlns/debugging code?
  • Did you specify the type of improvement?
  • Did you add appropriate documentation in docs/src?
  • Did you state the API impact?
  • Did you specify the code generation impact?
  • Did you request a desired merge strategy?
  • Did you add text to be included in the Release Notes for this change?

Type of Improvement

  • performance improvement

API Impact

No impact

Backend Code Generation Impact

This does perturb naming of a handful of signals (~200 in a design that has millions). I am not too worried but can spend more time understanding the issue if anyone finds this concerning.

Never mind, I figured it out. It was the case of _0 being a "false collision" because the namespace starts disambiguating at _1. There's extra logic now to preserve the old behavior for _0 that just results in 1 additional boolean check in an uncommon (but hit at least 200 times!) code path.

This has no impact on the generated FIRRTL/Verilog.
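
The _0 case mentioned above can be shown in isolation. This is a hypothetical helper, not the PR's code: given the index parsed from a requested `<base>_<idx>` and the next available index stored for `<base>`, it decides whether that exact name has already been handed out.

```scala
object SuffixCollision {
  // Issued names for a base are "<base>_1" up to "<base>_<nextAvailable - 1>".
  // "<base>_0" is never issued, so index 0 must not count as a collision.
  def collides(requestedIdx: Long, nextAvailable: Long): Boolean =
    requestedIdx != 0 && requestedIdx < nextAvailable
}
```

For example, if "foo" has been requested three times (issuing foo, foo_1, foo_2, with next available index 3), then foo_1 and foo_2 collide while foo_0 and foo_3 are still free.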

Desired Merge Strategy

  • Squash

Release Notes

Reduce the memory use of the internal Namespace data structure.

Reviewer Checklist (only modified by reviewer)

  • Did you add the appropriate labels?
  • Did you mark the proper milestone (Bug fix: 3.4.x, [small] API extension: 3.5.x, API modification or big change: 3.6.0)?
  • Did you review?
  • Did you check whether all relevant Contributor checkboxes have been checked?
  • Did you do one of the following when ready to merge:
    • Squash: You/the contributor Enable auto-merge (squash), clean up the commit message, and label with Please Merge.
    • Merge: Ensure that contributor has cleaned up their commit history, then merge with Create a merge commit.

@jackkoenig jackkoenig added this to the 3.5.x milestone Nov 20, 2022
@aswaterman
Member

Nice space reduction!

@jackkoenig
Contributor Author

I finally got some measurements and this decreases peak memory use for a fairly large design by about 3%. Not huge, but still nice.

@jackkoenig jackkoenig requested a review from azidar November 29, 2022 02:14
@jackkoenig jackkoenig merged commit 1654d87 into master Nov 29, 2022
@jackkoenig jackkoenig deleted the compressed-namespace branch November 29, 2022 05:41
mergify bot pushed a commit that referenced this pull request Nov 29, 2022
(cherry picked from commit 1654d87)

# Conflicts:
#	core/src/main/scala/chisel3/internal/Builder.scala
@mergify mergify bot added the Backported This PR has been backported label Nov 29, 2022
mergify bot added a commit that referenced this pull request Nov 29, 2022
* Implement compressed Namespace (#2856)

(cherry picked from commit 1654d87)

# Conflicts:
#	core/src/main/scala/chisel3/internal/Builder.scala

* Resolve backport conflicts

Co-authored-by: Jack Koenig <[email protected]>