add a config field that allows us to mlock() module mmaps into RAM #3820

Closed
wants to merge 8 commits into from


pchickey
Contributor

Needs benchmarking before this is considered seriously.

@github-actions github-actions bot added the wasmtime:api (Related to the API of the `wasmtime` crate itself) and wasmtime:config (Issues related to the configuration of Wasmtime) labels Feb 17, 2022
@github-actions

Subscribe to Label Action

cc @peterhuene

This issue or pull request has been labeled: "wasmtime:api", "wasmtime:config"

Thus the following users have been cc'd because of the following labels:

  • peterhuene: wasmtime:api

To subscribe or unsubscribe from this label, edit the .github/subscribe-to-label.json configuration file.


@github-actions

Label Messager: wasmtime:config

It looks like you are changing Wasmtime's configuration options. Make sure to
complete this check list:

  • If you added a new Config method, you wrote extensive documentation for
    it.

    Our documentation should be of the following form:

    Short, simple summary sentence.
    
    More details. These details can be multiple paragraphs. There should be
    information about not just the method, but its parameters and results as
    well.
    
    Is this method fallible? If so, when can it return an error?
    
    Can this method panic? If so, when does it panic?
    
    # Example
    
    Optional example here.
    
  • If you added a new Config method, or modified an existing one, you
    ensured that this configuration is exercised by the fuzz targets.

    For example, if you expose a new strategy for allocating the next instance
    slot inside the pooling allocator, you should ensure that at least one of our
    fuzz targets exercises that new strategy.

    Often, all that is required of you is to ensure that there is a knob for this
    configuration option in wasmtime_fuzzing::Config (or one
    of its nested structs).

    Rarely, this may require authoring a new fuzz target to specifically test this
    configuration. See our docs on fuzzing for more details.

  • If you are enabling a configuration option by default, make sure that it
    has been fuzzed for at least two weeks before turning it on by default.


To modify this label's message, edit the .github/label-messager/wasmtime-config.md file.

To add new label messages or remove existing label messages, edit the
.github/label-messager.json configuration file.


Member

@alexcrichton alexcrichton left a comment


With copy-on-write initialization of memories, another thing we may wish to measure or learn about is how new mappings behave here. For example, with Module::deserialize_file we mmap the whole file into memory and use that for a Module, but we then also mmap the memory-image part of the file into new linear memories for each instance. Even if we mlock the original Module into memory we might still run the risk that the kernel page cache doesn't have entries for the module's memory image.

One naive way we might be able to fix that is to mmap the memory image as read-only at some arbitrary address in the process, mlock it in place, and then ignore it. Theoretically, whenever a future mmap of the memory image happens, the kernel could realize that there's a read-only page mapped into the address space which can be reused as the source of bytes for copy-on-write. I don't know if that's how Linux behaves, though, as it seems somewhat magical. Otherwise, I'm not sure how we can eat the cost of possibly-slow-to-disk page faults on new memory maps at startup and ensure there's no cost in the future.
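
The naive approach described above could look roughly like the following sketch. It is not code from this PR: `PinnedImage` is a hypothetical type, the libc functions are declared by hand to avoid external crates, and the constants and the `i64` offset assume Linux or macOS on a 64-bit target. Whether the kernel then reuses the locked pages for later copy-on-write mappings of the same file is the open question raised in the comment, and the sketch does not answer it.

```rust
use std::fs::File;
use std::io;
use std::os::raw::{c_int, c_void};
use std::os::unix::io::AsRawFd;

// Values shared by Linux and macOS; other platforms may differ (assumption).
const PROT_READ: c_int = 0x1;
const MAP_PRIVATE: c_int = 0x2;

extern "C" {
    // `offset` is declared as i64, which assumes a 64-bit `off_t`.
    fn mmap(
        addr: *mut c_void,
        len: usize,
        prot: c_int,
        flags: c_int,
        fd: c_int,
        offset: i64,
    ) -> *mut c_void;
    fn munmap(addr: *mut c_void, len: usize) -> c_int;
    fn mlock(addr: *const c_void, len: usize) -> c_int;
}

/// A read-only mapping that exists only to keep its pages locked in RAM.
/// Hypothetical type for illustration.
pub struct PinnedImage {
    ptr: *mut c_void,
    len: usize,
}

impl PinnedImage {
    /// Maps `len` bytes of `file` starting at the page-aligned `offset`,
    /// read-only, and then locks the mapping with `mlock`.
    pub fn new(file: &File, offset: u64, len: usize) -> io::Result<PinnedImage> {
        let ptr = unsafe {
            mmap(
                std::ptr::null_mut(),
                len,
                PROT_READ,
                MAP_PRIVATE,
                file.as_raw_fd(),
                offset as i64,
            )
        };
        // mmap signals failure by returning MAP_FAILED, i.e. (void *)-1.
        if ptr as isize == -1 {
            return Err(io::Error::last_os_error());
        }
        // Wrap first so that `Drop` unmaps the region if `mlock` fails.
        let image = PinnedImage { ptr, len };
        if unsafe { mlock(image.ptr, image.len) } != 0 {
            return Err(io::Error::last_os_error());
        }
        Ok(image)
    }

    /// Read-only view of the pinned bytes.
    pub fn as_bytes(&self) -> &[u8] {
        unsafe { std::slice::from_raw_parts(self.ptr as *const u8, self.len) }
    }
}

impl Drop for PinnedImage {
    fn drop(&mut self) {
        // Unmapping also releases the lock on these pages.
        unsafe {
            munmap(self.ptr, self.len);
        }
    }
}
```

An embedder would keep the `PinnedImage` alive for as long as the module is loaded and otherwise never touch it. Note that `mlock` can fail when the process exceeds its locked-memory limit, so callers need to handle the error.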

crates/wasmtime/src/module.rs (outdated, resolved)
crates/wasmtime/src/module/serialization.rs (outdated, resolved)
@tschneidereit
Member

> Even if we mlock the original Module into memory we might still run the risk that the kernel page cache doesn't have entries for the module's memory image.

@alexcrichton by this, do you mean that the very first time a module is instantiated / a particular mapped page is accessed in an instance, it needs to be mapped in from disk?

If so, that's the intended behavior for this patch: basically, the idea is to not have to read all pages of all modules at startup, with the associated increases in startup time and RSS. Instead, we'd still map pages in lazily, but once we have them, we keep them around for sure.

Eagerly mapping in pages is a separate thing that we might also want to introduce support for, but I think it's best to have these be separate.

@alexcrichton
Member

Ah, I see. I did indeed mean during instantiation. While I don't think we want to eagerly pay the cost of initializing memory (as that defeats the original purpose of lazy init), there is a theoretical case to be made that the first page fault on each page should not be as slow as going to disk. That being said, I don't think we can guarantee that, because we can't guarantee membership in the kernel's page cache with mlock.

If we still want to lazily page things in, though, then I think this may want to use mlock2 with MLOCK_ONFAULT. Currently it uses mlock, which I believe guarantees that everything is resident after mlock returns, so it would have RSS and startup time implications.
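
The difference between the two calls can be sketched as follows. This is a minimal illustration rather than the PR's implementation: `LockMode`, `lock_region`, and `unlock_region` are hypothetical names, the libc functions are declared by hand to avoid external crates, and `MLOCK_ONFAULT` is given its Linux header value of `0x01`. Because `mlock2` is Linux-specific, the sketch reports the on-fault mode as unsupported elsewhere.

```rust
use std::io;
use std::os::raw::{c_int, c_uint, c_void};

// Value of MLOCK_ONFAULT in the Linux headers.
#[cfg(target_os = "linux")]
const MLOCK_ONFAULT: c_uint = 0x01;

extern "C" {
    fn mlock(addr: *const c_void, len: usize) -> c_int;
    fn munlock(addr: *const c_void, len: usize) -> c_int;
    #[cfg(target_os = "linux")]
    fn mlock2(addr: *const c_void, len: usize, flags: c_uint) -> c_int;
}

/// How pages in a locked range become resident (hypothetical enum).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum LockMode {
    /// `mlock`: every page is faulted in and pinned before the call returns.
    Eager,
    /// `mlock2(MLOCK_ONFAULT)`: a page is pinned only once it is first touched.
    OnFault,
}

/// Locks the pages backing `region`. The kernel rounds the range out to
/// page boundaries on Linux.
pub fn lock_region(region: &[u8], mode: LockMode) -> io::Result<()> {
    let addr = region.as_ptr() as *const c_void;
    let rc = match mode {
        LockMode::Eager => unsafe { mlock(addr, region.len()) },
        #[cfg(target_os = "linux")]
        LockMode::OnFault => unsafe { mlock2(addr, region.len(), MLOCK_ONFAULT) },
        #[cfg(not(target_os = "linux"))]
        LockMode::OnFault => {
            return Err(io::Error::new(
                io::ErrorKind::Unsupported,
                "MLOCK_ONFAULT requires Linux",
            ));
        }
    };
    if rc == 0 {
        Ok(())
    } else {
        Err(io::Error::last_os_error())
    }
}

/// Releases a lock previously taken with `lock_region`.
pub fn unlock_region(region: &[u8]) -> io::Result<()> {
    let rc = unsafe { munlock(region.as_ptr() as *const c_void, region.len()) };
    if rc == 0 {
        Ok(())
    } else {
        Err(io::Error::last_os_error())
    }
}
```

For a module mapping, the slice would be the bytes of the mmap. With `LockMode::OnFault` the call itself does not fault anything in, so RSS only grows as pages are actually touched, which matches the lazy behavior discussed above.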

@sunfishcode
Member

The macOS failure here is because mlock_with / mlock2 is only available on Linux.

@bjorn3
Contributor

bjorn3 commented Feb 18, 2022

What exactly would be the benefit of this change? Once a module is instantiated and run, all necessary pages should already be in memory, right? The only thing that can kick them back out again is a low-memory condition, in which case mlock would require the kernel to evict other pages even if they are used more frequently.

@tschneidereit
Member

@bjorn3 you're right that in most contexts this doesn't make much sense. It can be useful, however, in settings where the goal is to have close to full memory usage at all times, but with most data being evictable because it's just there for caching purposes. In such a setting, there's often more information available at the application level than at the kernel level about which data is more important to retain, and this is an attempt to make better use of that information.

cfallin added a commit to cfallin/wasmtime that referenced this pull request Feb 18, 2022
In bytecodealliance#3820 we see an issue with the new heuristics that control use of
memfd: it's entirely possible for a reasonable Wasm module produced by a
snapshotting system to have a relatively sparse heap (less than 50%
filled). A system that avoids memfd because of this would have an
undesirable performance reduction on such modules.

Ultimately we should try to implement a hybrid scheme where we support
outlier/leftover initializers, but for now this PR makes the "always
allow dense" limit configurable. This way, embedders that want to ensure
that memfd is used can do so, if they have other knowledge about the
maximum heap size allowed in their system.

(Partially addresses bytecodealliance#3820 but let's leave it open to track the hybrid
idea)
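
The kind of heuristic this commit message describes can be sketched as below. This is an assumption-laden illustration, not Wasmtime's actual code: `HeapImage`, `use_static_image`, and `always_allow_dense_limit` are hypothetical names, and the exact rule in Wasmtime may differ. It only encodes the two ideas from the message: images less than 50% filled are rejected as too sparse, unless they fit under a configurable size limit.

```rust
/// Summary of a module's initial heap contents (hypothetical type).
pub struct HeapImage {
    /// Bytes actually covered by data initializers.
    pub data_bytes: u64,
    /// Size of the dense image that would have to be materialized,
    /// from the start of the heap to the end of the last initializer.
    pub image_bytes: u64,
}

/// Decides whether a dense, memfd-backed image should be built.
pub fn use_static_image(img: &HeapImage, always_allow_dense_limit: u64) -> bool {
    // Images at or below the configured limit are always allowed,
    // no matter how sparse they are.
    if img.image_bytes <= always_allow_dense_limit {
        return true;
    }
    // Larger images must be at least 50% filled with real data.
    img.data_bytes.saturating_mul(2) >= img.image_bytes
}
```

For example, a snapshot with 10 MiB of data spread over a 64 MiB image is rejected with a 16 MiB limit but accepted once the embedder raises the limit to 64 MiB, which is the escape hatch the message describes for embedders that know their maximum heap size.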
alexcrichton pushed a commit that referenced this pull request Feb 22, 2022
mpardesh pushed a commit to avanhatt/wasmtime that referenced this pull request Mar 17, 2022
…ance#3831)
Pat Hickey and others added 5 commits March 17, 2022 15:04
which eliminates special cases nicely
Also factor in the mlock hint into the initial image for a module's
memory by forcing memfd to be used instead of reusing the existing
mapping.
@alexcrichton alexcrichton force-pushed the pch/mlock_experiment branch from 561f325 to ba97602 Compare March 17, 2022 22:05
@alexcrichton
Member

As a heads-up @pchickey, I just pushed to this branch. I rebased it and tweaked things a bit, but I think this is still not in a landable state.

@pchickey pchickey closed this Mar 18, 2022
@alexcrichton alexcrichton deleted the pch/mlock_experiment branch April 1, 2022 14:38
5 participants