WebAssembly SIMD review #487
Comments
If a platform doesn't natively support the corresponding SIMD operations, WebAssembly falls back to emulation, right? So with respect to CPU fingerprinting, you can probably figure out whether a system supports the corresponding SIMD operations through timing attacks, but that's it? How much do CPUs vary in terms of support for these operations? Just want to know how many bits of CPU information we're exposing here.
If a platform doesn't have native support for SIMD operations, engines can either handle this with a scalar fallback, or applications will need to deploy a separate WebAssembly binary to use as a fallback. Picking between two binaries depending on whether a feature is supported is a common feature detection model for WebAssembly. The baseline assumption for the current SIMD proposal is that the operations have reasonable mappings to hardware instructions on most modern hardware. In practice this means all Intel hardware with SSE4.1 or later, as well as most modern ARM and MIPS LE hardware. With respect to CPU fingerprinting, it is possible to figure out whether a system supports the corresponding SIMD operations, but given that the current set of operations is a basic set of vector operations available on most modern hardware, this information may not be particularly useful. One of the goals for the proposal is also that the performance of these operations should be portable across architectures, i.e. a similar class of applications should see a performance boost on all supported architectures. That said, with a select set of operations and careful perf measurements, it may be possible to distinguish between Intel and ARM hardware, but this will also depend to a large extent on engine implementations (level of optimization, code generation support for advanced extensions like AVX on Intel, etc.).
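As a rough illustration of the "pick between two binaries" model described above, here is a minimal TypeScript sketch. It asks the engine to validate a tiny hand-assembled module whose only function produces a `v128` value. The byte sequence assumes the opcode numbering of the finalized SIMD specification, which may differ from the experimental encoding in use at the time of this thread, and the names `simdSupported`, `pickModuleUrl`, `app.simd.wasm` and `app.scalar.wasm` are hypothetical.

```typescript
// Tiny probe module (assumption: finalized SIMD opcode numbering).
const SIMD_PROBE = new Uint8Array([
  0, 97, 115, 109, 1, 0, 0, 0, // "\0asm" magic, version 1
  1, 5, 1, 96, 0, 1, 123,      // type section: () -> v128
  3, 2, 1, 0,                  // function section: one function of type 0
  10, 10, 1, 8, 0,             // code section: one body, 8 bytes, no locals
  65, 0,                       // i32.const 0
  253, 15,                     // i8x16.splat
  253, 98,                     // i8x16.popcnt
  11,                          // end
]);

// True when the engine accepts a module that uses SIMD instructions.
function simdSupported(): boolean {
  return typeof WebAssembly === "object" && WebAssembly.validate(SIMD_PROBE);
}

// Pure selection step: SIMD build when available, scalar build otherwise.
function pickModuleUrl(hasSimd: boolean, simdUrl: string, scalarUrl: string): string {
  return hasSimd ? simdUrl : scalarUrl;
}

// Hypothetical file names, for illustration only.
const moduleUrl = pickModuleUrl(simdSupported(), "app.simd.wasm", "app.scalar.wasm");
console.log(`would load ${moduleUrl}`);
```

Note that validation only reveals whether the engine accepts SIMD opcodes, not whether they are lowered to native vector instructions or emulated.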
Had a meeting with @arunetm where we discussed the TAG review, how we review, and what we expect. We really would like a better explainer, geared toward landing the feature as part of the broader Web Platform. The current one is really technical and written for the WASM CG, and differs somewhat from what we are expecting. Please read our explainer about explainers: https://w3ctag.github.io/explainers For instance, we would like a section about Considered Alternatives, which should list the SIMD.js work and why that was abandoned. We would also like information about whether it would be possible to bring SIMD to JavaScript in the future so that WASM and JS don't diverge too much, or whether that might not make sense. Are there any current outstanding issues or disagreements that we should know about? Is the proposal playing favorites with some architectures, or is the design being done in a way that lets different architectures gain similar performance with optimized pipelines? What is the plan for SIMD wider than 128 bits?
Thanks for the feedback.
Evaluating the criteria linked above and your reply here, the areas that need to be explicitly addressed are "the alternatives which have already been considered and why they were not chosen;" and code examples. Code examples from C/C++/Rust were not originally included in the overview because the bytecode should be language agnostic, and since there is currently no JS API, code examples showing direct use from JS are not available. I will work with @arunetm on an overview that links to the current technical document, but I would also like to briefly address some of the questions below.
The WebAssembly SIMD work is a direct offshoot of the SIMD.js work which is no longer in active development at TC39. The SIMD.js proposal is inactive for a few different reasons -
A lot of this is actually offset by introducing this at a lower level in WebAssembly - with the current proposal we've been able to demonstrate consistent performance gains across multiple architectures on real world applications. There are no plans currently to expose this to JS, as the issues that existed when SIMD.js was marked inactive still exist today. That said, I don't see the current proposal as fully divergent from JS, as existing JS applications can indirectly use SIMD values in ArrayBuffers, and use Wasm function calls to manipulate SIMD values as long as the types themselves are not exposed to JS.
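To make the point about indirect use from JS concrete, here is a minimal TypeScript sketch in which the host writes eight floats into Wasm linear memory and calls an exported function that adds them four lanes at a time. The module is hand-assembled for illustration: its exports `mem` and `add`, the memory layout, and the helper `addVectors` are all hypothetical, and the bytes assume the opcode numbering of the finalized SIMD specification. The `v128` values never cross the JS boundary; only `i32` byte offsets do.

```typescript
// Hand-assembled module (assumption: finalized SIMD opcodes):
//   (func (export "add") (param $a i32) (param $b i32) (param $out i32)
//     v128.store($out, f32x4.add(v128.load($a), v128.load($b))))
const ADD_F32X4 = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00,       // magic, version
  0x01, 0x07, 0x01, 0x60, 0x03, 0x7f, 0x7f, 0x7f, 0x00, // type: (i32, i32, i32) -> ()
  0x03, 0x02, 0x01, 0x00,                               // one function of type 0
  0x05, 0x03, 0x01, 0x00, 0x01,                         // one memory, minimum 1 page
  0x07, 0x0d, 0x02,                                     // export section, 2 entries
  0x03, 0x6d, 0x65, 0x6d, 0x02, 0x00,                   // "mem" -> memory 0
  0x03, 0x61, 0x64, 0x64, 0x00, 0x00,                   // "add" -> function 0
  0x0a, 0x19, 0x01, 0x17, 0x00,                         // code: one 23-byte body, no locals
  0x20, 0x02,                                           // local.get $out
  0x20, 0x00, 0xfd, 0x00, 0x00, 0x00,                   // local.get $a; v128.load
  0x20, 0x01, 0xfd, 0x00, 0x00, 0x00,                   // local.get $b; v128.load
  0xfd, 0xe4, 0x01,                                     // f32x4.add
  0xfd, 0x0b, 0x00, 0x00,                               // v128.store
  0x0b,                                                 // end
]);

// Adds two 4-element vectors through Wasm; returns null if SIMD is unavailable.
function addVectors(a: number[], b: number[]): number[] | null {
  if (typeof WebAssembly !== "object" || !WebAssembly.validate(ADD_F32X4)) return null;
  const instance = new WebAssembly.Instance(new WebAssembly.Module(ADD_F32X4));
  const mem = instance.exports.mem as WebAssembly.Memory;
  const add = instance.exports.add as (a: number, b: number, out: number) => void;
  const f32 = new Float32Array(mem.buffer);
  f32.set(a, 0);  // first operand at byte offset 0
  f32.set(b, 4);  // second operand at byte offset 16
  add(0, 16, 32); // result written at byte offset 32
  return Array.from(f32.subarray(8, 12));
}

console.log(addVectors([1, 2, 3, 4], [10, 20, 30, 40]));
```

On an engine with SIMD support this should log the lane-wise sums; elsewhere it logs `null`, which is where a scalar fallback would take over.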
Not necessarily a disagreement, but there are aspects of this proposal that conflict with the base assumption that WebAssembly is always deterministic. By this I mean that SIMD in hardware can be extremely performant, but can also exhibit non-determinism, which this proposal explicitly tries to avoid, both to be consistent with WebAssembly in general and to avoid platform-specific behavior that could enable fingerprinting (as brought up in a previous comment on this issue). An example of this is the Wasm SIMD floating point min/max instructions - the hardware instructions for these are not IEEE 754 compliant, and the Wasm MVP is specified with strict IEEE 754 compliance. Discussions are still in progress, but this is an example of the tradeoffs that are sometimes required.
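The min/max tradeoff can be sketched with scalar code. The TypeScript below models one lane only and uses 64-bit numbers in place of 32-bit lanes. `x86StyleMin` is an assumed model of the x86 SSE `MINPS` rule (return the second operand unless the first is strictly smaller), chosen here as one example of hardware behavior and not named in the thread. `wasmStyleMin` relies on `Math.min`, which treats NaN and signed zeros the same way the scalar Wasm `min` instructions do.

```typescript
// Assumed per-lane model of x86 MINPS: the result depends on operand order
// when a NaN or a pair of zeros with different signs is involved.
function x86StyleMin(a: number, b: number): number {
  return a < b ? a : b;
}

// Wasm-style min: NaN if either input is NaN, and -0 counts as smaller than +0.
function wasmStyleMin(a: number, b: number): number {
  return Math.min(a, b);
}

// Inputs where the two rules can disagree.
const cases: Array<[number, number]> = [
  [NaN, 1],
  [1, NaN],
  [0, -0],
  [-0, 0],
];

for (const [a, b] of cases) {
  const hw = x86StyleMin(a, b);
  const wasm = wasmStyleMin(a, b);
  // Object.is distinguishes -0 from +0 and treats NaN as equal to NaN.
  console.log(a, b, "agree:", Object.is(hw, wasm));
}
```

Matching the Wasm-style result on hardware that follows the first rule takes extra instructions per operation, which is the kind of cost the discussion above is weighing.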
The primary goals for the WebAssembly SIMD proposal are usability for real world applications and consistent performance on benchmarks representative of real world usage - this is emphasized to avoid some of the pitfalls of SIMD.js. The WebAssembly SIMD proposal draws heavily on feedback from application developers who are experimenting with, and using, the current proposal. The proposal does not play favorites: where particular operations have suboptimal codegen, they are evaluated based on user feedback, and concrete alternative semantics are actively solicited where available. Unfortunately, due to the nature of the hardware support, alternative semantics are not always available - and when they are not, emulating these operations is usually more expensive in both execution time and code size.
As the current fixed width proposal is still in the experimental phase, and implementations are gaining traction, there hasn't been a significant amount of work on designing a future version of this proposal. That said, there is some preliminary work in this area that will hopefully gain more traction as the MVP is stabilized. More information about active work in this area can be found here.
I believe that @arunetm has a more Web Platform focused explainer ready to share soon. Arun?
Thanks Kenneth. Yes, we have an updated explainer in the works, developed with @dtig, and will share it soon.
Please find an updated explainer with web platform focus here: https://github.com/WebAssembly/simd/blob/master/proposals/simd/W3CTAG-SIMDExplainer.md
We talked about this in our plenary call today, and we think this is ready to close. Thanks for bringing this to us! Please file a followup review request if your design significantly changes. Thanks!
Hello TAG!
I'm requesting a TAG review of WebAssembly SIMD.
https://www.chromestatus.com/feature/6533147810332672
Further details:
You should also know that...
This is purely a WebAssembly performance feature that does not affect web API behavior, but is still useful for developers to be aware of as it can change performance characteristics of applications using WebAssembly. It adds a new 128-bit value type that is not exposed to JavaScript and several new opcodes for vector operations that are documented here.
We'd prefer the TAG provide feedback as:
🐛 open issues in our GitHub repo for each point of feedback