Inlining of methods #343

yorickpeterse · 2022-09-21T16:11:08Z

Inlining is probably the single most important optimisation pass to apply, as many other optimisations depend on it. Inko's bytecode compiler doesn't perform inlining at the moment, and we should change that.

The first step towards this is to determine what criteria we'll use to inline a method, and what other languages (e.g. Swift and Rust) use. Simply looking at the number of lines of code probably isn't good enough. async methods would never be inlined because we can't, they're used for messages after all.

When inlining methods, care should be taken to "rewrite" whatever self refers to in the inlined code. That is, if we inline A.foo into B.bar, any reference to self in A.foo should point to the appropriate instance of A, not B.

TODO

Resources

https://www.microsoft.com/en-us/research/wp-content/uploads/2002/07/inline.pdf

The text was updated successfully, but these errors were encountered:

yorickpeterse · 2024-04-04T16:17:00Z

To add some context: LLVM doesn't optimize across modules unless you enable link time optimizations. LTO can drastically increase compile times, and doesn't always produce better results. Because of this, if you call an Inko method defined in module A from module B, the call won't get inlined. As we use methods for operators (e.g. Int.+), this means Inko code is much slower than it should be due to the sheer number of function calls.

As a test I hacked together some dirty changes that basically link all LLVM modules together and compile them as a whole, completely ignoring incremental compilation in the process. For https://github.com/jinyus/related_post_gen this results in a 5x improvement of runtime performance when using --opt=aggressive, but of course with an increase in compile times.

My long-term desire is to perform enough inlining and optimizations at the MIR level, such that we can then pass that to LLVM to take care of platform specific optimizations. In such a case it doesn't really matter that LLVM only inlines on a per module basis, as we've already done most of the work (and ideally in parallel as much as possible). This would also benefit any future additional backends, should we ever add those.

yorickpeterse · 2024-08-16T13:55:35Z

For 0.17.0 we'll start by adding support for fn inline, which guarantees the method is inlined at the call site (provided it's called through static dispatch). We can then start using this for important methods such as Int.+ and other operator methods. The syntax will be as follows:

fn inline foo {}
fn inline mut foo {}
fn pub inline mut foo {}

The inline keyword is only available to module methods, static methods, and instance methods. It can't be used for required methods defined in a trait, but it can be used for default methods. It can't be used for async methods as those use a different calling convention and can't be inlined into the callers due to their async nature.

Inlining is done at the MIR level, allowing us to inline across modules. Inlining is performed after type specialization.

When inlining we need to rewrite register IDs. To do so, we take the maximum register ID of the method we're inlining into and note it down as MAX. We then take the registers of the method to inline, and simply append them to the registers list of the method we're inlining into. We then traverse the MIR of the method to inline, and for each register update its ID such that it's the result of original ID + MAX. For instance methods we need to replace register 0 (the receiver) with the receiver register at the call site. Using this approach we shouldn't need any hash maps of sorts to map old to new registers, making this process a bit more efficient.

yorickpeterse · 2024-09-26T19:41:18Z

Note to self: if a method is tagged as fn inline and can't be called through dynamic dispatch (= it's not part of a trait), we should not include it in the final machine code. For fn inline methods that are part of trait implementations we can't do this, because they might be called through dynamic dispatch.

yorickpeterse · 2024-09-27T00:28:24Z

For inlining we need to take care to flush caches accordingly. For example, if module A defines method M and B calls it, then the decision as to whether M should be inlined or not should update the cache of B accordingly. This has to work both ways: if M previously wasn't inlined into B but now it should be inlined, or if it was inlined before but this is no longer desired, the cache for B should be flushed.

Since multiple methods from A might be inlined into B, simply recording B as depending on A isn't enough, because the code generated for B can now change even if the source code for A remains the same.

One option is this:

When inlining M into B, we record the method ID of M into B, resulting in a list of methods inlined into B. When generating object files, we iterate over this list and obtain the full symbol names. We then sort this list and hash it into the object file's hash code. If the list changes, so does the hash code, resulting in the object cache getting flushed.

In addition, we flag B as depending on A such that if the source of A changes but the inline candidates remain the same, and B doesn't otherwise explicitly depend on A, we still flush the cache accordingly.

The compiler is now able to inline calls to static and instance methods. For each method we calculate a rough weight/cost, and calls are inlined into their callers until the maximum weight is reached. Inlining is done bottom-up using Tarjan's strongly connected components algorithm, reducing the amount of duplicate inlining work. Inlining is also done in a deterministic order as to ensure incremental compilation caches can be reused as much as possible. The current inlining threshold is on the conservative end as to not increase compile times and compile-time memory usage too much. Over time we may relax this based on user reports and any extra optimization passes we might add. If a method is annotated with the `inline` keyword, it's _always_ inlined into the caller regardless of it or the caller's weight. This is meant to be used when you want to guarantee a method is inlined, such as for the various operator methods of Int, Float, Bool, etc. Because of this guarantee one should use it sparingly, as to not increase the compile time and executable size too much. This fixes #343. Changelog: added

yorickpeterse added performance Changes related to improving performance and removed type::performance labels Mar 15, 2023

yorickpeterse removed this from the 0.2.5 milestone Mar 15, 2023

yorickpeterse added the feature New things to add to Inko, such as a new standard library module label Nov 16, 2023

yorickpeterse added this to the 0.17.0 milestone Aug 16, 2024

yorickpeterse self-assigned this Aug 16, 2024

yorickpeterse closed this as completed in 489c710 Oct 8, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Inlining of methods #343

Inlining of methods #343

yorickpeterse commented Sep 21, 2022 •

edited

Loading

yorickpeterse commented Apr 4, 2024

yorickpeterse commented Aug 16, 2024

yorickpeterse commented Sep 26, 2024

yorickpeterse commented Sep 27, 2024

Inlining of methods #343

Inlining of methods #343

Comments

yorickpeterse commented Sep 21, 2022 • edited Loading

TODO

Resources

yorickpeterse commented Apr 4, 2024

yorickpeterse commented Aug 16, 2024

yorickpeterse commented Sep 26, 2024

yorickpeterse commented Sep 27, 2024

yorickpeterse commented Sep 21, 2022 •

edited

Loading