8344232: [PPC64] secondary_super_cache does not scale well: C1 and interpreter #22881

TheRealMDoerr · 2024-12-25T15:40:23Z

PPC64 implementation of ead0116. I have implemented a couple of rotate instructions.
The first commit only implements lookup_secondary_supers_table_var and uses it in C2. The second commit changes the interpreter, runtime and C1 to use it as well.
The C1 part is refactored so that, when UseSecondarySupersTable is disabled, the same code is generated as before this patch. Some stubs are modified to provide one more temp register.
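
For readers who have not seen the table-based lookup that lookup_secondary_supers_table_var performs, the following is a rough Java model of a bitmap-indexed, packed hash table. It is only a sketch under assumptions: the class name, the hash function, the linear probing and the use of plain long ids in place of Klass pointers are hypothetical and not taken from the HotSpot sources. The real code is PPC64 assembly and is not modeled here.

```java
public class SecondarySupersTableSketch {
    // Assumption: 64 logical slots, so one long can serve as the occupancy bitmap.
    static final int SLOTS = 64;

    final long bitmap;   // bit i set means logical slot i is occupied
    final long[] packed; // only the occupied slots, stored in slot order

    SecondarySupersTableSketch(long[] supers) {
        if (supers.length >= SLOTS) {
            throw new IllegalArgumentException("sketch needs at least one free slot");
        }
        long[] slots = new long[SLOTS];
        long bits = 0;
        for (long s : supers) {
            int pos = hashSlot(s);
            // Linear probing with wrap-around on collision.
            while ((bits & (1L << pos)) != 0) {
                pos = (pos + 1) & (SLOTS - 1);
            }
            slots[pos] = s;
            bits |= 1L << pos;
        }
        bitmap = bits;
        packed = new long[Long.bitCount(bits)];
        for (int i = 0, n = 0; i < SLOTS; i++) {
            if ((bits & (1L << i)) != 0) {
                packed[n++] = slots[i];
            }
        }
    }

    // Hypothetical hash: multiply and keep the top 6 bits (0..63).
    static int hashSlot(long klassId) {
        return (int) ((klassId * 0x9E3779B97F4A7C15L) >>> 58);
    }

    boolean contains(long superId) {
        int pos = hashSlot(superId);
        // An unset bit at the home slot is an immediate miss: no array access needed.
        while ((bitmap & (1L << pos)) != 0) {
            // Number of occupied slots below pos = index into the packed array.
            int index = Long.bitCount(bitmap & ((1L << pos) - 1));
            if (packed[index] == superId) {
                return true;
            }
            pos = (pos + 1) & (SLOTS - 1);
        }
        return false;
    }

    public static void main(String[] args) {
        SecondarySupersTableSketch t =
            new SecondarySupersTableSketch(new long[] {11L, 22L, 33L, 44L});
        System.out.println(t.contains(22L));
        System.out.println(t.contains(55L));
    }
}
```

The point of the model is that a miss is usually decided by a single bit test, with no shared cache field being written, which is the property the contention benchmarks below exercise.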

A performance difference can be observed when C2 is disabled (measured on Power10):

-XX:TieredStopAtLevel=1 -XX:-UseSecondarySupersTable:
SecondarySuperCacheHits.test  avgt   15  13.028 ± 0.005  ns/op
SecondarySuperCacheInterContention.test     avgt   15  417.746 ± 19.046  ns/op
SecondarySuperCacheInterContention.test:t1  avgt   15  417.852 ± 17.814  ns/op
SecondarySuperCacheInterContention.test:t2  avgt   15  417.641 ± 23.431  ns/op
SecondarySuperCacheIntraContention.test  avgt   15  340.995 ± 5.620  ns/op

-XX:TieredStopAtLevel=1 -XX:+UseSecondarySupersTable:
SecondarySuperCacheHits.test  avgt   15  14.539 ± 0.002  ns/op
SecondarySuperCacheInterContention.test     avgt   15  25.667 ± 0.576  ns/op
SecondarySuperCacheInterContention.test:t1  avgt   15  25.709 ± 0.655  ns/op
SecondarySuperCacheInterContention.test:t2  avgt   15  25.626 ± 0.820  ns/op
SecondarySuperCacheIntraContention.test  avgt   15  22.466 ± 1.554  ns/op

SecondarySuperCacheHits seems to be slightly slower (14.539 vs. 13.028 ns/op), but SecondarySuperCacheInterContention and SecondarySuperCacheIntraContention are much faster, by roughly 16x and 15x respectively (when C2 is disabled).
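
The ratios behind that summary can be checked with a few lines of Java. This is just arithmetic on the scores quoted above; the class and method names are made up for illustration, and the error bars reported by the benchmark are ignored.

```java
public class SpeedupSketch {
    // Ratio of two ns/op scores; a value above 1 means "after" is faster.
    static double ratio(double beforeNsPerOp, double afterNsPerOp) {
        return beforeNsPerOp / afterNsPerOp;
    }

    public static void main(String[] args) {
        // Scores copied from the -UseSecondarySupersTable and +UseSecondarySupersTable runs.
        System.out.printf("Hits:            %.2fx%n", ratio(13.028, 14.539));
        System.out.printf("InterContention: %.2fx%n", ratio(417.746, 25.667));
        System.out.printf("IntraContention: %.2fx%n", ratio(340.995, 22.466));
    }
}
```

A ratio below 1 for the Hits benchmark corresponds to the small slowdown mentioned above, while the two contention benchmarks come out at well over an order of magnitude.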

Progress

Change must be properly reviewed (1 review required, with at least 1 Reviewer)
Change must not contain extraneous whitespace
Commit message must refer to an issue

Issue

JDK-8344232: [PPC64] secondary_super_cache does not scale well: C1 and interpreter (Enhancement - P4)

Reviewers

Amit Kumar (@offamitkumar - Committer) Review applies to f7c1b79c
Richard Reingruber (@reinrich - Reviewer)

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/22881/head:pull/22881
$ git checkout pull/22881

Update a local copy of the PR:
$ git checkout pull/22881
$ git pull https://git.openjdk.org/jdk.git pull/22881/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 22881

View PR using the GUI difftool:
$ git pr show -t 22881

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/22881.diff

Using Webrev

Link to Webrev Comment

bridgekeeper · 2024-12-25T15:41:23Z

👋 Welcome back mdoerr! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

openjdk · 2024-12-25T15:42:15Z

@TheRealMDoerr This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8344232: [PPC64] secondary_super_cache does not scale well: C1 and interpreter

Reviewed-by: rrich, amitkumar

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 303 new commits pushed to the master branch:

5cc690d: 8347994: Add additional diagnostics to macOS failure handler to assist with diagnosing MCast test failures
c00557f: 8345049: Remove the jmx.tabular.data.hash.map compatibility property
8b46db0: 8345045: Remove the jmx.remote.x.buffer.size JMX notification property
119899b: 8345048: Remove the jmx.extend.open.types compatibility property
89bfcb8: 8348308: Make fields of ListSelectionEvent final
17df515: 8348303: Remove repeated 'a' from ListSelectionEvent
337118d: 8348388: Incorrect copyright header in TestFluidAndNonFluid.java
3069e91: 8344969: Remove the jmx.mxbean.multiname compatibility property
c882160: 8344966: Remove the allowNonPublic MBean compatibility property
6032f6e: 8341696: C2: Non-fluid StringBuilder pattern bails out in OptoStringConcat
... and 293 more: https://git.openjdk.org/jdk/compare/62a4544bb76aa339a8129f81d2527405a1b1e7e3...master

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

openjdk · 2024-12-25T15:42:44Z

@TheRealMDoerr The following label will be automatically applied to this pull request:

hotspot

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

mlbridge · 2024-12-25T15:46:07Z

Webrevs

offamitkumar

Looks good. Maybe update copyright header year.

TheRealMDoerr · 2025-01-08T11:30:59Z

Thanks for the review!
I haven't found precise rules on how to handle copyright headers. I usually use the year of the PR publication date. Does anybody know of other requirements?

offamitkumar · 2025-01-08T16:18:28Z

I have seen the header getting updated for some PRs, like this one: #22246

So I think we are expected to update it, though I haven't seen any such rule.

TheRealMDoerr · 2025-01-08T16:21:13Z

> I have seen the header getting updated for some PRs, like this one: #22246
>
> So I think we are expected to update it, though I haven't seen any such rule.

Some people use a script, but it's unclear if it does the right thing: #22890 (comment)

TheRealMDoerr · 2025-01-10T23:05:29Z

sh make/scripts/update_copyright_year.sh says "No files were changed". All changes in this PR were done in 2024, so Copyright year changes are only needed for files which are changed in 2025.

reinrich · 2025-01-21T14:33:15Z

src/hotspot/cpu/ppc/c1_Runtime1_ppc.cpp

-        __ check_klass_subtype_slow_path(sub_klass, super_klass, temp1_reg, temp2_reg); // returns with CR0.eq if successful
-        __ crandc(CCR0, Assembler::equal, CCR0, Assembler::equal); // failed: CR0.ne
+                       temp1_reg = R6;
+        __ check_klass_subtype_slow_path(sub_klass, super_klass, temp1_reg, noreg); // may return with CR0.eq if successful


reinrich

The comment is unclear to me. Where is the result of the subtype check? Can it also return with CR0.ne if successful?
I noticed you added the crandc to check_klass_subtype_slow_path_linear(), but if we reach there when calling from this location, the crandc is not emitted because L_success == nullptr. Is this ok?
I'd appreciate comments on the masm methods explaining how the result of the subtype check is conveyed.

TheRealMDoerr

The correct result is always in CR0 with this PR (unless a label or result GP reg is provided).
"return" means "blr" here. That can optionally be used in case of success. In this case, CR0 is always "eq".
I've moved the crandc instruction into check_klass_subtype_slow_path_linear, which contains such a "blr" for a success case. This way, the linear version works exactly as before.
The new code check_klass_subtype_slow_path_table doesn't use "blr". That's why I added "may" to the comment.

reinrich

This is extremely hard to see.
L2154 with the "blr" in check_klass_subtype_slow_path_linear looks redundant to me. It should be removed if you agree.
The comment here should then be adapted too.
Also, the comment at macroAssembler_ppc.cpp:2258 needs to be adapted because fallthrough from check_klass_subtype_slow_path does not mean "not successful". L_failure could be renamed to L_fast_path_failure.

TheRealMDoerr

I think L_failure is correct. And it's used the same way on all platforms.

reinrich

Ah, I see. I missed that L_failure is passed by reference.

jdk/src/hotspot/cpu/ppc/macroAssembler_ppc.cpp

Lines 2159 to 2168 in c00557f

void MacroAssembler::check_klass_subtype(Register sub_klass,
                                         Register super_klass,
                                         Register temp1_reg,
                                         Register temp2_reg,
                                         Label& L_success) {
  Label L_failure;
  check_klass_subtype_fast_path(sub_klass, super_klass, temp1_reg, temp2_reg, &L_success, &L_failure);
  check_klass_subtype_slow_path(sub_klass, super_klass, temp1_reg, temp2_reg, &L_success);
  bind(L_failure); // Fallthru if not successful.
}

Therefore it's never null, and L_failure is reached if and only if the result of the type check is negative.

reinrich · 2025-01-22T12:21:54Z

src/hotspot/cpu/ppc/macroAssembler_ppc.cpp

@@ -2154,6 +2154,96 @@ void MacroAssembler::check_klass_subtype_slow_path(Register sub_klass,
  else if (result_reg == noreg) { blr(); } // return with CR0.eq if neither label nor result reg provided

  bind(fallthru);
+  if (L_success != nullptr && result_reg == noreg) {


Is there a problem if L_success == nullptr && result_reg == noreg and there aren't any secondary supers?
In that case we would reach here with CR0.eq from L2134 and we would fallthrough with CR0.eq. Due to the change in C1StubId::slow_subtype_check_id we would return there with CR0.eq.

This is a reproducer:

public class InstanceOfTest {
    public static interface TestInterfaceI { }
    public static class TestClassNegative { }

    public static void main(String[] args) {
        Object obj = new TestClassNegative();
        for (int i = 100_000; i > 0; i--) {
            dontinline_testMethod(obj);
        }
        boolean result = dontinline_testMethod(obj);
        System.out.println("result: " + result);
    }

    static boolean dontinline_testMethod(Object obj) {
        return obj instanceof TestInterfaceI;
    }
}

$ ./jdk/bin/java -XX:TieredStopAtLevel=1 -XX:-UseSecondarySupersTable InstanceOfTest
result: true
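
The failure mode described in this comment can be modeled in plain Java. This is a simplified sketch, not the emitted PPC64 code: the CR0.eq bit is represented by a boolean, the class and method names are hypothetical, and the assumption that the empty-array case falls through with "eq" left over from a length comparison is inferred from the review comment rather than taken from the sources.

```java
public class Cr0FallthroughSketch {
    /**
     * Models the linear scan for a caller that passes neither a success label
     * nor a result register, so the outcome is read from CR0 at fallthrough.
     * Returns the modeled CR0.eq bit (true is interpreted as "is a subtype").
     */
    static boolean linearScan(long[] secondarySupers, long superKlass, boolean clearEqOnMiss) {
        // Assumption: the length is compared with 0 first, which sets "eq" for an empty array.
        boolean cr0Eq = (secondarySupers.length == 0);
        if (!cr0Eq) {
            for (long s : secondarySupers) {
                cr0Eq = (s == superKlass); // each element compare overwrites the flag
                if (cr0Eq) {
                    return true;           // hit: CR0.eq
                }
            }
        }
        // Miss path. Without an explicit clear (crandc in the real code),
        // the stale "eq" from the length compare survives to the caller.
        if (clearEqOnMiss) {
            cr0Eq = false;
        }
        return cr0Eq;
    }

    public static void main(String[] args) {
        long[] noSupers = {};
        System.out.println("without clear: " + linearScan(noSupers, 7L, false));
        System.out.println("with clear:    " + linearScan(noSupers, 7L, true));
    }
}
```

In this model a non-empty array that misses still reports a miss even without the clear, because the last element compare leaves the flag at "ne". That matches the observation that the wrong result only shows up when there aren't any secondary supers, as with TestClassNegative in the reproducer.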

TheRealMDoerr · 2025-01-22T21:53:22Z

Thanks for looking at this! The condition was wrong. I have improved the design of check_klass_subtype_slow_path_linear and removed the early return by "blr". Please take a look at 37789b3.

reinrich · 2025-01-23T13:49:34Z

src/hotspot/cpu/ppc/macroAssembler_ppc.cpp

+    li(result_reg, 1); // load non-zero result (indicates a miss)
+  } else if (L_success == nullptr) {
+    crandc(CCR0, Assembler::equal, CCR0, Assembler::equal); // miss indicated by CR0.ne
+  }
  b(fallthru);

  bind(hit);
  std(super_klass, target_offset, sub_klass); // save result to cache
  if (result_reg != noreg) { li(result_reg, 0); } // load zero result (indicates a hit)
  if (L_success != nullptr) { b(*L_success); }


reinrich

Handling L_success != nullptr should be put on the else-branch of the previous if-statement.

TheRealMDoerr

Doesn't make a real difference, but I've cleaned it up.

reinrich

Thanks. It matches the assertion you've added. I like consistency. It helps understanding stuff.

TheRealMDoerr · 2025-01-23T14:07:55Z

I've run most of the tier 1 tests with JTREG="VM_OPTIONS=-XX:-UseSecondarySupersTable" and didn't see new failures. I'll rerun tests. Note that Oracle Copyright years are already updated in head, but I don't want to merge because the PPC64le build is currently broken.

reinrich

Thanks for doing the port 👍
Cheers, Richard.

TheRealMDoerr · 2025-01-23T14:31:35Z

Thanks for the review and for finding the bug!

TheRealMDoerr · 2025-01-24T09:48:53Z

Tier 1-4 have passed with and without UseSecondarySupersTable on both Linux ppc64le and AIX.
/integrate

openjdk · 2025-01-24T09:50:08Z

Going to push as commit 4a375e5.
Since your change was applied there have been 320 commits pushed to the master branch:

0df9dcb: 8346572: Check is_reserved() before using ReservedSpace instances
a09f06d: 8348265: RMIConnectionImpl: Remove Subject.callAs on MarshalledObject
0395593: 8346751: Internal java compiler error with type annotations in constants expression in constant fields
2daafe4: 8348283: java.lang.classfile.components.snippets.PackageSnippets shipped in java.base.jmod
50ca450: 8340784: Remove PassFailJFrame constructor with screenshots
416d469: 8347008: beancontext package spec does not clearly explain why the API is deprecated
471d63c: 8343609: Broken links in java.xml
7f16a08: 8348240: Remove SystemDictionaryShared::lookup_super_for_unregistered_class()
48ece07: 8282862: AwtWindow::SetIconData leaks old icon handles if an exception is detected
356e2a8: 8348406: Remove tests GrantAllPermToExtWhenNoPolicy and PrincipalExpansionError from problem list
... and 310 more: https://git.openjdk.org/jdk/compare/62a4544bb76aa339a8129f81d2527405a1b1e7e3...master

Your commit was automatically rebased without conflicts.

openjdk · 2025-01-24T09:50:26Z

@TheRealMDoerr Pushed as commit 4a375e5.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

TheRealMDoerr added 2 commits December 23, 2024 22:07

8344232: [PPC64] secondary_super_cache does not scale well: C1 and interpreter

aff644d

Change interpreter, runtime and C1 to use lookup_secondary_supers_table_var.

f7c1b79

openjdk bot added the rfr Pull request is ready for review label Dec 25, 2024

openjdk bot added the hotspot label Dec 25, 2024

offamitkumar approved these changes Jan 7, 2025

reinrich reviewed Jan 21, 2025

reinrich reviewed Jan 22, 2025

Remove early return from check_klass_subtype_slow_path.

37789b3

reinrich reviewed Jan 23, 2025

Minor cleanup of check_klass_subtype_slow_path_linear.

db3d6de

reinrich approved these changes Jan 23, 2025

openjdk bot added the ready Pull request is ready to be integrated label Jan 23, 2025

openjdk bot added the integrated Pull request has been integrated label Jan 24, 2025

openjdk bot closed this Jan 24, 2025

openjdk bot removed the ready and rfr labels Jan 24, 2025

TheRealMDoerr deleted the 8344232_PPC64_secondary_super_cache branch January 24, 2025 09:51
