8255917: runtime/cds/SharedBaseAddress.java failed "assert(reserved_rgn != 0LL) failed: No reserved region" #1657
Hi Yumin,
A small request with respect to commenting.
We should test this with all permutations of NativeMemoryTracking and Xshare. But I guess the standard tests do this already?
Cheers, Thomas
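As a rough illustration of what "all permutations" means here, the sketch below enumerates flag combinations. The value lists (off/summary/detail for -XX:NativeMemoryTracking, and auto/on/off for -Xshare) are an assumption about which settings are worth covering, and `flag_permutations` is a hypothetical helper, not part of any test harness.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: builds every combination of the two options.
// The value lists are assumptions about what a test matrix might cover.
std::vector<std::string> flag_permutations() {
  const std::vector<std::string> nmt = {"off", "summary", "detail"};
  const std::vector<std::string> xshare = {"auto", "on", "off"};
  std::vector<std::string> out;
  for (const auto& n : nmt) {
    for (const auto& x : xshare) {
      out.push_back("-XX:NativeMemoryTracking=" + n + " -Xshare:" + x);
    }
  }
  return out;
}
```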
Thanks for the review! The standard tests do run permutations of these options; I will take another look to see whether they cover all the cases. I think they are sufficient, so a new test case should not be needed, but I will check again. Thanks.
… on required address
// the gap reserved at the end of the archive space.
archive_space_rs = total_rs.first_part(ccs_begin_offset,
                                       (size_t)os::vm_allocation_granularity(),
                                       /*split=*/false);
Since /*split=*/false is passed, the region is no longer split using os::split_reserved_memory(). This means on Windows, we cannot use os::release_memory() to free the individual regions of archive_space_rs and class_space_rs.
The Windows API of VirtualFree says:
"Releases the specified region of pages, or placeholder [...], dwSize must be 0 (zero), and lpAddress must point to the base address returned by the VirtualAlloc function when the region is reserved. The function fails if either of these conditions is not met."
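To make the constraint concrete, here is a toy model (it does not call the Windows API) in which a reservation can only be released through the base address it was reserved at. `ToyAddressSpace`, `ToyRegion`, `first_part` and `last_part` are hypothetical names chosen for this sketch; the two helper functions only imitate a shallow split that creates views without touching the OS-level mapping.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>

// Toy model, not the Windows API: tracks reservations by base address.
class ToyAddressSpace {
 public:
  // Records a reservation covering [base, base + size).
  void reserve(std::uintptr_t base, std::size_t size) {
    assert(size > 0);
    reservations_[base] = size;
  }

  // Imitates the MEM_RELEASE rule: only a whole reservation can be released,
  // and only through the base address it was reserved at.
  bool release(std::uintptr_t addr) {
    return reservations_.erase(addr) == 1;
  }

  std::size_t count() const { return reservations_.size(); }

 private:
  std::map<std::uintptr_t, std::size_t> reservations_;
};

// A view over part of a reservation; creating one changes nothing at OS level.
struct ToyRegion {
  std::uintptr_t base;
  std::size_t size;
};

// Shallow split: the first n bytes of r.
ToyRegion first_part(const ToyRegion& r, std::size_t n) {
  return ToyRegion{r.base, n};
}

// Shallow split: everything in r after the first n bytes.
ToyRegion last_part(const ToyRegion& r, std::size_t n) {
  return ToyRegion{r.base + n, r.size - n};
}
```

In this model, releasing the class-space view fails because its base is not the base of any reservation, while releasing the archive view succeeds but takes the whole reservation with it, since it shares the base of the total space.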
I suggest we do this:
- Add an extra ReservedSpace& total_rs parameter to MetaspaceShared::reserve_address_space_for_archives(). Return the total_rs when we go through this path of the code.
- Also pass total_rs to MetaspaceShared::release_reserved_spaces. If total_rs.is_reserved() is true, release total_rs instead of the two smaller spaces.
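The release side of this suggestion could look roughly like the sketch below. It is not the HotSpot code: `ToySpace` is a hypothetical stand-in for ReservedSpace, and `ReleaseLog` exists only so the order of releases can be observed.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for HotSpot's ReservedSpace.
struct ToySpace {
  std::string name;
  bool reserved;
  bool is_reserved() const { return reserved; }
};

// Records the names of the spaces handed back to the OS, in order.
using ReleaseLog = std::vector<std::string>;

// Releases one space if it is still reserved.
void release_one(ToySpace& rs, ReleaseLog& log) {
  if (rs.is_reserved()) {
    log.push_back(rs.name);
    rs.reserved = false;
  }
}

void release_reserved_spaces(ToySpace& total_rs, ToySpace& archive_rs,
                             ToySpace& class_rs, ReleaseLog& log) {
  if (total_rs.is_reserved()) {
    // One OS-level mapping: release it as a whole. The two smaller spaces
    // are only views into it, so they are just marked as gone.
    release_one(total_rs, log);
    archive_rs.reserved = false;
    class_rs.reserved = false;
  } else {
    // Two separate OS-level mappings: release each one individually.
    release_one(archive_rs, log);
    release_one(class_rs, log);
  }
}
```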
To make sure this PR is correct, we should add something like the following in os::release_memory(), and check for this log in test/hotspot/jtreg/runtime/cds/SharedBaseAddress.java:
if (!res) {
log_info(os)("os::release_memory(" PTR_FORMAT ", " SIZE_FORMAT ") failed", p2i(addr), bytes);
}
Perhaps, in a separate RFE, we should add an assert in os::release_memory(), or at least change the log_info to log_warning.
Hi Ioi,
Where is the release you worry about? On Windows, since 8255978, os::release_memory() notices if the region the caller wants to release does not correspond exactly to a memory mapping at the OS level, and will assert. Do you see that assert?
Let me try to understand:
- If useBaseAddress==true, Yumin now creates two separate mappings and can release them individually.
- If useBaseAddress==false, there is one mapping as before, but we now split it shallowly. We don't release it, since we use file IO to read into it. If someone were to release one of those parts, we should see an assert on Windows.
(I am a tiny bit unhappy about the increasing complexity of the patch, since it negates some of the work done to simplify it back in June.)
About tracing, since 8256864 we trace Virtualxxx calls, so the tracing is already there - we trace VirtualFree() errors for "os=info". If you only care for windows, that tracing could suffice.
Thanks, Thomas
If useBaseAddress==false, after the total_space is successfully mapped, subsequent operations may fail. For example,
- mapinfo->map_regions() may fail to commit the necessary memory for doing os::read().
- mapinfo->validate_shared_path_table() may fail because the runtime classpath is not compatible.
In these cases, we need to call MetaspaceShared::release_reserved_spaces().
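The control flow being described can be sketched as follows. This is a simplified model, not the actual MetaspaceShared code: `ToyMapSteps` and `map_archive` are hypothetical, and each step is a callback so that a failure can be injected at any point.

```cpp
#include <cassert>
#include <functional>

enum class MapResult { Success, Failure };

// Hypothetical bundle of the steps discussed above, as injectable callbacks.
struct ToyMapSteps {
  std::function<bool()> reserve;         // reserve the total space
  std::function<bool()> map_regions;     // may fail to commit memory
  std::function<bool()> validate_paths;  // may fail on a classpath mismatch
  std::function<void()> release;         // release the reserved spaces
};

MapResult map_archive(const ToyMapSteps& s) {
  if (!s.reserve()) {
    return MapResult::Failure;  // nothing was reserved, nothing to release
  }
  if (s.map_regions() && s.validate_paths()) {
    return MapResult::Success;
  }
  // A step failed after the reservation succeeded: the reserved
  // address space has to be handed back.
  s.release();
  return MapResult::Failure;
}
```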
> Do you see that assert?
Hmm, I think we should add a new test for this specifically: java -XX:ArchiveRelocationMode=1 -cp mismatched.jar to force the failure in mapinfo->validate_shared_path_table().
> If you only care for windows, that tracing could suffice.
Since we are changing the split parameter for all platforms, I think we should test for all platforms, not just windows.
The new test case can be something like this:
$ java -XX:DumpLoadedClassList=HelloWorld.classlist -cp HelloWorld.jar HelloWorld
Hello World
$ java -Xshare:dump -XX:SharedClassListFile=HelloWorld.classlist -cp HelloWorld.jar \
-XX:SharedArchiveFile=tmp.jsa -XX:SharedBaseAddress=0
$ touch XXX.jar
$ java -Xshare:auto -XX:SharedArchiveFile=tmp.jsa -cp XXX.jar:HelloWorld.jar -Xlog:cds \
-showversion HelloWorld
...
[0.034s][info][cds] Archive(s) were created with -XX:SharedBaseAddress=0. Always map at os-selected address.
[0.034s][info][cds] Try to map archive(s) at an alternative address
[0.034s][info][cds] Mapped static region #0 at base 0x00007f21f7800000 top 0x00007f21f7806000 (MiscCode)
[0.034s][info][cds] Mapped static region #1 at base 0x00007f21f7806000 top 0x00007f21f7a1d000 (ReadWrite)
[0.034s][info][cds] Mapped static region #2 at base 0x00007f21f7a1d000 top 0x00007f21f7daf000 (ReadOnly)
[0.048s][info][cds] UseSharedSpaces: shared class paths mismatch (hint: enable -Xlog:class+path=info to diagnose the failure)
[0.048s][info][cds] Unmapping region #0 at base 0x00007f21f7800000 (MiscCode)
[0.048s][info][cds] Unmapping region #1 at base 0x00007f21f7806000 (ReadWrite)
[0.048s][info][cds] Unmapping region #2 at base 0x00007f21f7a1d000 (ReadOnly)
[0.048s][info][cds] UseSharedSpaces: Unable to map shared spaces
java version "16-internal" 2021-03-16
Java(TM) SE Runtime Environment (slowdebug build 16-internal+0-adhoc.iklam.open)
Java HotSpot(TM) 64-Bit Server VM (slowdebug build 16-internal+0-adhoc.iklam.open, mixed mode)
Hello World
You can set a breakpoint at MetaspaceShared::release_reserved_spaces() to make sure it's called with total_rs.
Note that -Xshare:auto must be used. If -Xshare:on is used, the VM will exit immediately without calling MetaspaceShared::release_reserved_spaces(). Most of the CDS tests are executed with -Xshare:on. That's why we didn't see the assert on Windows with Yumin's earlier patch: in Mach5 tier 4, we run test/hotspot/jtreg/runtime/cds/appcds/BootClassPathMismatch.java with -XX:ArchiveRelocationMode=1, but the test itself uses -Xshare:on.
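The difference between the two modes, as described in this comment, can be summarized in a few lines. The enum and function names below are hypothetical and only restate the behavior claimed above; they are not taken from the VM sources.

```cpp
#include <cassert>

enum class XShareMode { Auto, On };
enum class FailureAction { ExitVM, ReleaseAndRunWithoutCDS };

// What happens when the archive cannot be used, per the comment above.
FailureAction on_archive_mapping_failure(XShareMode mode) {
  return mode == XShareMode::On ? FailureAction::ExitVM
                                : FailureAction::ReleaseAndRunWithoutCDS;
}

// A test only reaches the release path if the VM keeps running after
// the mapping failure.
bool exercises_release_path(XShareMode mode) {
  return on_archive_mapping_failure(mode) ==
         FailureAction::ReleaseAndRunWithoutCDS;
}
```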
When the jar is mismatched, say with -cp non-exist.jar (we could not use -cp non-exist.jar:hello.jar since, once hello.jar, the dump-time jar, is found and matched, non-exist.jar is ignored; a test showed it is OK), the output is:
[0.035s][info ][cds] Mapped static region #0 at base 0x00007f98e3800000 top 0x00007f98e3806000 (MiscCode)
[0.035s][info ][cds] Mapped static region #1 at base 0x00007f98e3806000 top 0x00007f98e3a1e000 (ReadWrite)
[0.035s][info ][cds] Mapped static region #2 at base 0x00007f98e3a1e000 top 0x00007f98e3db2000 (ReadOnly)
[0.049s][info ][cds] UseSharedSpaces: shared class paths mismatch (hint: enable -Xlog:class+path=info to diagnose the failure)
[0.049s][info ][cds] Unmapping region #0 at base 0x00007f98e3800000 (MiscCode)
[0.049s][info ][cds] Unmapping region #1 at base 0x00007f98e3806000 (ReadWrite)
[0.049s][info ][cds] Unmapping region #2 at base 0x00007f98e3a1e000 (ReadOnly)
[0.049s][debug][cds] Released shared space 0x00007f98e3800000
[0.049s][debug][cds] Released shared space (archive + class) 0x00007f98e3800000
[0.049s][info ][cds] UseSharedSpaces: Unable to map shared spaces
The repeated "Released shared space" message will be eliminated in the next version. It comes from here:
} else {
  unmap_archive(static_mapinfo);
  unmap_archive(dynamic_mapinfo);
  log_debug(cds)("Released shared space " INTPTR_FORMAT, p2i(total_space_rs.base()));
  release_reserved_spaces(total_space_rs, archive_space_rs, class_space_rs);
}
You're correct that if you use -cp non-exist.jar:hello.jar, the non-exist.jar will be ignored because it doesn't exist, and CDS will not treat the classpath as a mismatch.
However, in my example, I created the XXX.jar file, so CDS will treat the classpath as a mismatch.
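The distinction can be modeled with a deliberately simplified check. This is an assumption-laden sketch, not the real validate_shared_path_table() logic: it assumes that non-existent runtime entries are dropped and that the remaining runtime classpath must then start with the dump-time classpath. `existing` stands in for a file-system lookup, and all names are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <set>
#include <string>
#include <vector>

using ClassPath = std::vector<std::string>;

// Drops entries that are not in `existing` (a stand-in for checking the disk).
ClassPath existing_entries(const ClassPath& cp,
                           const std::set<std::string>& existing) {
  ClassPath out;
  for (const auto& entry : cp) {
    if (existing.count(entry) != 0) {
      out.push_back(entry);
    }
  }
  return out;
}

// Simplified rule (an assumption): after ignoring non-existent entries,
// the runtime classpath must begin with the dump-time classpath.
bool class_path_matches(const ClassPath& dump_cp, const ClassPath& runtime_cp,
                        const std::set<std::string>& existing) {
  const ClassPath rt = existing_entries(runtime_cp, existing);
  if (rt.size() < dump_cp.size()) {
    return false;
  }
  return std::equal(dump_cp.begin(), dump_cp.end(), rt.begin());
}
```

Under this model, non-exist.jar:HelloWorld.jar still matches an archive dumped with HelloWorld.jar, while XXX.jar:HelloWorld.jar does not once XXX.jar has been created.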
Please check 03. 02 was generated when I merged with the most current head and the remote head was not updated correctly. After setting the remote head correctly, 03 was regenerated and is the correct one for review. Thanks.
…sed after archive loading failed; Unmap bitmap after archive failure. Fixed reserved region name for adding reserved region.
This branch has many conflicts; something has gone wrong since push-02. I have closed this PR and will send a single patch in a new PR.
Hi, please review.
On Windows, a file cannot be mapped into memory that is already reserved. When mapping the CDS archive, we first reserve enough memory and then release it just before mapping. When the CDS archive is used together with class space, we need to split the whole space into two; that is, release the whole reserved space and reserve the two split spaces again. This is problematic because another thread or the system can kick in and take the released space in between.
This fix is the first of two steps. On Windows, when the requested mapping address is used, the CDS archive and the class space (ccs) are reserved separately but on a contiguous range, so there is no need to call split. If either reservation fails, both are released and we fall back to the other path, which does not do the 'real' split either. On Windows (when class space is used), the reserved space will be released anyway.
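A minimal sketch of the "reserve separately but contiguously" idea, using a toy address space rather than real OS calls. `ToyOS`, `ToyReservation` and `reserve_archive_and_class_space` are hypothetical names for this illustration; the caller is assumed to fall back to the other path when `ok` is false.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>

// Toy address space: each entry is one independent OS-level reservation.
class ToyOS {
 public:
  // Reserves [base, base + size) if it overlaps no existing reservation.
  bool reserve_at(std::uintptr_t base, std::size_t size) {
    assert(size > 0);
    for (const auto& kv : mappings_) {
      if (base < kv.first + kv.second && kv.first < base + size) {
        return false;
      }
    }
    mappings_[base] = size;
    return true;
  }

  // Releases the reservation that starts exactly at base.
  bool release(std::uintptr_t base) { return mappings_.erase(base) == 1; }

  std::size_t count() const { return mappings_.size(); }

 private:
  std::map<std::uintptr_t, std::size_t> mappings_;
};

struct ToyReservation {
  std::uintptr_t archive_base;
  std::uintptr_t class_base;
  bool ok;
};

// Reserves the archive space and the class space as two adjacent mappings.
// If the second reservation fails, the first is released again.
ToyReservation reserve_archive_and_class_space(ToyOS& os,
                                               std::uintptr_t requested,
                                               std::size_t archive_size,
                                               std::size_t class_size) {
  ToyReservation r = {0, 0, false};
  if (!os.reserve_at(requested, archive_size)) {
    return r;
  }
  if (!os.reserve_at(requested + archive_size, class_size)) {
    os.release(requested);
    return r;
  }
  r.archive_base = requested;
  r.class_base = requested + archive_size;
  r.ok = true;
  return r;
}
```

Because each space is its own mapping in this model, either one can later be released on its own, which is the property a shallow split of a single mapping does not have.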
Tests: tier1-5, tier7
Download
$ git fetch https://git.openjdk.java.net/jdk pull/1657/head:pull/1657
$ git checkout pull/1657