-
-
Notifications
You must be signed in to change notification settings - Fork 44
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Replace LLVM with Cranelift #674
Comments
One issue I encountered last time, and which still seems present, is that it's not clear how one would produce basic debug information (enough to produce a stacktrace). I asked about this in bytecodealliance/wasmtime#5141, but this didn't yield a simple/clear answer. |
We use structure returns in a few places, mainly to interface with the Rust runtime. It seems Cranelift supports this, though it's not clear to what degree: |
Tail calls are currently not supported: bytecodealliance/wasmtime#1065. This doesn't matter much though as we don't make use of them, and LLVM's support for them is spotty as well. |
LLVM has the |
For stack slots, Cranelift has the following methods:
The |
Digging through the Cranelift code and associated projects, it seems producing debug info is still painful, requiring knowledge/generating of raw DWARF data through the gimli crate. I don't have such knowledge at the moment, so it will likely take a while before we can swap out LLVM with Cranelift. |
Per https://docs.rs/cranelift-codegen/latest/cranelift_codegen/ir/trait.InstBuilder.html#method.atomic_load and related methods, it seems Cranelift enforces sequential ordering for atomics. This could have a rather significant impact as we use atomics in a few places, but we absolutely don't need nor want sequential consistency there. |
https://github.com/bytecodealliance/wasmtime/blob/b583c54fda13b53dea362861125dd1e2ced1381d/cranelift/codegen/src/isa/x64/lower.isle#L3213-L3223 states that at least on x86, atomics are just regular loads and stores (optionally with a fence for stores). That's not too bad, but it would be nice to have precise control over this. |
Cranelift doesn't have an equivalent of |
To define constant values, it seems you have to use You then "import" to the global into a function using |
Calling a function is done as follows:
|
Function imports are done using |
In the LLVM backend, we rely on the mem2reg pass to construct the IR in SSA form. This requires allocating temporaries using In Cranelift it seems this isn't necessary, as one can just call |
Jump tables/switches are done using I believe our |
If they don't you can use |
declare_data_in_data is for referencing a global variable from another global variable. To define one you need to first use |
@bjorn3 Thanks for the details! 😃 |
Per Zulip and the documentation: the I'm not sure how we'd implement that though, because when generating stack loads/stores it's not clear if the slot is still used after that. This would probably require some form of liveness analysis. |
Related to that: We'll manually need to generate the correct signatures for functions that take or return structures. This in turn needs to take into account the structure sizes, as structures <= 8 bytes are passed directly while larger ones use an extra argument. We could simplify things by dropping support for structure returns/arguments, but this requires adjusting the runtime library in a whole bunch of places, which I'd like to avoid. |
I think my biggest concern with Cranelift so far is that it's perhaps a bit too low-level. For example, we need to calculate structure sizes, field offsets, etc ourselves. Because of this, I suspect the amount of code needed in the backend would roughly equal that of our current LLVM backend, as we'd simplify things in one area but complicate them in another. For example, we have a bunch of code to specify the structure sizes and what not such that LLVM can calculate the right offsets. I had initially hoped that most of this could go away with Cranelift, but it turns out we'd have to keep most of it. I think Cranelift would really benefit from a slightly more high-level wrapper that abstracts over the various platform differences and what not, basically LLVM--, but such a library doesn't exist unfortunately. I'm going to experiment with doing LLVM generation in parallel, to see if/to what degree that's even possible with Inkwell. I'll also need to really think hard if Cranelift is worth the trouble in its current state. |
Running LLVM/Inkwell in parallel appears tricky. In spite of some of the types being Unfortunately, due to how Inkwell/LLVM is set up, we can't really work around sharing that particular data as it defines all the structure sizes and what not, and creating those for every thread is a waste. If we do create that data on a per-thread basis, and ignore the code potentially being incorrect for a moment, compile times for the standard library tests go down from 3.3 seconds to 1.15 seconds with 8 threads. That's only 3x faster but with 8x the concurrency, which is disappointing. Note that for this experiment my queue is just |
To add to the above: Inkwell exposes basically everything as immutable types/methods, even though LLVM mutates things under the hoods. This means that even if we were to get things to work in parallel, it may cause problems in the future if LLVM starts to mutate/share data where this isn't expected. |
Regarding stack slots: based on how we use structure returns right now, and the fact that Inko values are heap allocated (apart from Int/Float/etc), I think stack slot reuse (or more precisely a lack thereof) isn't a big deal for the time being. |
Another challenge: our FFI supports variadic function arguments, something that's needed to interface with certain C libraries (e.g. cURL). Per this issue Cranelift has no support for this whatsoever, and per this comment you basically have to emulate it. I'm starting to lean more towards sticking with LLVM for the time being, and focusing more on making it parallel and incremental where possible, seeing as transitioning to Cranelift is likely going to require a lot of work. |
I'm going to close this as there are no plans to do this in the next several years at least, in addition to Cranelift lacking various features that we need (as discussed in this thread). |
Description
We currently use LLVM, mainly because at the time of switching to a native code compiler Cranelift wasn't quite ready yet. I think since then it released the features we need (e.g. checked integer arithmetic). Switching would significantly simplify distribution and improve compile times. This issue is meant to keep track of what to keep in mind, what we'd have to change, etc.
It's important to keep in mind that we're not going to support both LLVM and Cranelift. Instead, the long-term goal is to write optimizations ourselves and use Cranelift/whatever to just turn that into machine code.
Related work
No response
Related issues
Missing Cranelift features
The text was updated successfully, but these errors were encountered: