Openness to significant contributions? #25
Comments
Hello, I'm generally very open to all kinds of contributions, especially when they improve the performance or usability. I do not have any plans to actively implement any new features myself, but I'm happy to discuss and review any new features anyone wants to contribute. 1. Replacement of the String =>
Buffer is actually a Uint8Array subclass, so there's no need for the additional wrapper. Personally, I've been using UTF-16LE, since by my measurements the overhead of the UTF-8 transformation exceeds the cost of hashing the additional bytes; as long as I'm internally consistent with my encodings, there's no correctness problem there. Re: BigInt, I'm happy to do the analysis... Given that BigInt is arbitrary precision, I was assuming there'd be a performance hit vs. u32 math.
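To make the encoding point concrete, here is a minimal Node.js sketch (Buffer is not available in browsers). The sample string and variable names are illustrative assumptions, not taken from the library or the fork. It shows that Buffer.from already yields a Uint8Array, and that UTF-8 and UTF-16LE produce different byte sequences for the same string, so a hash is only comparable when the encoding is kept consistent.

```javascript
// Illustrative input; "\u00e9" is a single precomposed character (e with acute).
const input = "h\u00e9llo";

// UTF-8 via the standard TextEncoder (portable across Node and browsers).
const viaTextEncoder = new TextEncoder().encode(input);

// UTF-8 via Buffer.from; a Buffer is already a Uint8Array, so no wrapper is needed.
const viaBufferUtf8 = Buffer.from(input, "utf8");

// UTF-16LE: two bytes per UTF-16 code unit of the JS string.
const viaBufferUtf16 = Buffer.from(input, "utf16le");

console.log(viaBufferUtf8 instanceof Uint8Array); // true
console.log(viaBufferUtf8.length, viaBufferUtf16.length); // 6 10
```

The two UTF-8 paths produce identical bytes, while the UTF-16LE bytes differ, so hashes computed from the two encodings will not match each other.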
Yes, it's a separate API, the thing I was trying to describe is that the wasm will now be using a new wasm instruction ( |
I was surprised to see the large-byte-size degradation of the Buffer.from(utf8) mechanism since that's not what the large benchmarks show overall, and indeed that does seem to come down to the encoding (also node 17.3.0):
And I'm pleasantly surprised to see that BigInt should do nicely here. I benchmarked toString performance and there's minimal difference between my precomputed version and the BigInt toString call (with BigInt even coming out ahead frequently), so presumably removing the memory operations involved in the other approaches will yield a net win.
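For readers comparing the two hex-formatting approaches discussed here, the following is a rough sketch under stated assumptions: the helper names hexFromBytes and hexFromBigInt are hypothetical, and the bytes are assumed to be in big-endian order (WebAssembly memory is little-endian, so real code would need to account for byte order). It is not the fork's or the library's actual implementation.

```javascript
// Precomputed table: byte value -> two-digit lowercase hex string.
const HEX = Array.from({ length: 256 }, (_, b) => b.toString(16).padStart(2, "0"));

// Hypothetical helper: format 8 big-endian bytes as a 16-digit hex string.
function hexFromBytes(bytes) {
  let out = "";
  for (let i = 0; i < 8; i++) out += HEX[bytes[i]];
  return out;
}

// Hypothetical helper: format an unsigned 64-bit BigInt the same way.
function hexFromBigInt(value) {
  return BigInt.asUintN(64, value).toString(16).padStart(16, "0");
}

// Arbitrary sample value with a leading zero byte to exercise the padding.
const bytes = new Uint8Array([0x00, 0xef, 0x46, 0xdb, 0x37, 0x51, 0xd8, 0xe9]);
const asBigInt = new DataView(bytes.buffer).getBigUint64(0, false);

console.log(hexFromBytes(bytes)); // 00ef46db3751d8e9
console.log(hexFromBigInt(asBigInt)); // 00ef46db3751d8e9
```

Both paths need explicit zero padding; without padStart the BigInt version would drop leading zeros.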
I'm happy to do the work to replace the relevant calls with BigInts and can probably swing doing a 32-bit streaming implementation... Though at this point it feels like releasing a major version is likely the best approach for all of this, if you've got no concerns about that?
Ah I see, I've only seen that
Absolutely, that would be great. With the addition of the streaming API alongside these performance improvements, I think it's the perfect opportunity to release |
Great! I certainly don't know of an implementation that doesn't support bulk memory, just wanted to be explicit about changes. One last question and I can get started: my current version uses top-level await rather than a promise factory export. I'm assuming you'd prefer to keep the promise factory pattern for CJS compatibility, but figured I should check.
I don't think I fully understand what you mean by using top-level await in this particular case. I'm assuming you mean that you initialise it immediately in the module, without having to call a function first; is that what you meant? If yes, I'm definitely very apprehensive about that. Initialising it when the module is first loaded makes it unpredictable when it's effectively ready, and makes it impossible to avoid initialising it on code paths where it's not used, short of code splitting and dynamic imports. That is not only non-trivial, it's also very inefficient for such a small library.
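The distinction can be sketched as follows. This is only an illustration: createInstance is a hypothetical name, and the eight-byte array is a minimal empty WebAssembly module (magic number plus version) standing in for the library's real binary. With a promise factory, evaluating the module does no work; instantiation happens only when a caller asks for it.

```javascript
// Minimal valid (empty) WebAssembly module, used as a stand-in binary.
const wasmBytes = new Uint8Array([0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00]);

// Counts how many times instantiation was requested.
let initCount = 0;

// Promise factory (hypothetical name): nothing is instantiated until it is called.
function createInstance() {
  initCount += 1;
  return WebAssembly.instantiate(wasmBytes).then(({ instance }) => instance);
}

// Evaluating the module so far has not instantiated anything.
console.log(initCount); // 0

// A code path that actually needs hashing opts in explicitly.
const ready = createInstance().then((instance) => {
  console.log(instance instanceof WebAssembly.Instance); // true
  return instance;
});
```

With top-level await, the equivalent of createInstance() would run as a side effect of importing the module, whether or not any caller ends up hashing anything.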
Yep! Sounds good, I'll maintain the current pattern.
Hi there,
To start, thanks a ton for this project! It's been the starting point for a huge performance improvement in some of my employer's build tooling (which I'm working on open sourcing). As you may recall a while back (#9), I mentioned I'd forked this library to add some streaming xxhash64 functionality with the (timeline-naive) hope of upstreaming those additions last spring.
At this point, our internal fork has accumulated a few significant differences (performance-focused) that I'd also be interested in contributing, but given some of the implications of both the streaming additions and the performance improvements I figured I should reach out and confirm your interest in them...
To start, I'll share the relevant benchmarking runs (my fork is locally named xxhash64-wasm, as I had no need for the xxhash32 implementation):
These improvements are largely due to two changes:
1. Replacement of the String => Uint8Array mechanism
While this isn't generalizable to the browser, Buffer.from significantly outperforms node's TextEncoder implementation:
Buffer.from (UTF-16LE) x 3,621,691 ops/sec ±0.30% (143 runs sampled)
Buffer.from x 3,063,490 ops/sec ±0.67% (142 runs sampled)
TextEncoder.encode x 1,218,150 ops/sec ±3.42% (118 runs sampled)
This is an important consideration for my use case, but if you'd prefer to leave this type of choice to the h64Raw API, that certainly would seem fair to me.
2. Replacement of the u64 => Hex String mechanism
It's a bit uglier, but precomputing a byte-to-two-digit-hex-string mapping is substantially faster than the current DataView-based approach, and that's particularly important in those small-input benchmarks. There's potentially a question about how this interacts with the API, given that the shared codepath in master uses a DataView for the raw API as well as the string, but either way, the improvement proved worthwhile for my use case.
There are two other, more minor improvements also contributing to the above results that I expect to be less controversial:
TypedArray wrapping the WebAssembly.Memory instance
Lastly, there's the matter of the streaming implementation. One key item there is that the streaming algorithm relies on memcpy, and to make that performant my fork enables bulk memory operations and utilizes the memory.copy instruction. Obviously use of that new target would be a significant breaking change for the library, so I wanted to make sure there was interest in changing such things before jumping through the relevant hoops. Implementation-wise, the addition of the streaming API also requires extracting finalize into its own function, but I don't expect that you'll care very much about that. The only other thing to mention about the streaming implementation is that I've only implemented the 64-bit algorithm, so we'd be talking about some additional API asymmetry.
In terms of the streaming API, it's currently:
Certainly open to feedback there, but it's proven pretty functional to mirror the crypto API for my use case.
So all that said, please let me know what you think and I'll get to work on a PR (or just publishing my fork wholesale if you're not interested).
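As an illustration of what a crypto-style create/update/digest shape looks like, here is a small self-contained sketch. It is not the fork's API and not xxhash: the class name StreamingFnv1a is hypothetical, and 32-bit FNV-1a is swapped in purely because it fits in a few lines. The point is only that chunks fed through update produce the same digest as hashing the whole input at once.

```javascript
// Stand-in streaming hasher using 32-bit FNV-1a (NOT xxhash).
class StreamingFnv1a {
  constructor() {
    this.state = 0x811c9dc5; // FNV-1a 32-bit offset basis
  }

  // Absorb a chunk of bytes; returns `this` so calls can be chained.
  update(chunk) {
    for (const byte of chunk) {
      this.state ^= byte;
      // Multiply by the FNV prime with 32-bit wraparound, kept unsigned.
      this.state = Math.imul(this.state, 0x01000193) >>> 0;
    }
    return this;
  }

  // Return the current state as an 8-digit hex string.
  digest() {
    return this.state.toString(16).padStart(8, "0");
  }
}

const encoder = new TextEncoder();
const streamed = new StreamingFnv1a()
  .update(encoder.encode("hello "))
  .update(encoder.encode("world"))
  .digest();
const oneShot = new StreamingFnv1a().update(encoder.encode("hello world")).digest();

console.log(streamed === oneShot); // true
```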
Thanks,
Marcus