Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Inlining of methods #343

Closed
16 tasks done
yorickpeterse opened this issue Sep 21, 2022 · 4 comments
Closed
16 tasks done

Inlining of methods #343

yorickpeterse opened this issue Sep 21, 2022 · 4 comments
Assignees
Labels
compiler Changes related to the compiler feature New things to add to Inko, such as a new standard library module performance Changes related to improving performance
Milestone

Comments

@yorickpeterse
Copy link
Collaborator

yorickpeterse commented Sep 21, 2022

Inlining is probably the single most important optimisation pass to apply, as many other optimisations depend on it. Inko's bytecode compiler doesn't perform inlining at the moment, and we should change that.

The first step towards this is to determine what criteria we'll use to inline a method, and what other languages (e.g. Swift and Rust) use. Simply looking at the number of lines of code probably isn't good enough. async methods would never be inlined because we can't, they're used for messages after all.

When inlining methods, care should be taken to "rewrite" whatever self refers to in the inlined code. That is, if we inline A.foo into B.bar, any reference to self in A.foo should point to the appropriate instance of A, not B.

TODO

  • Initial implementation for static methods
  • Support instance methods
  • Flush incremental caches for modules that contain inlined code if the source module changes, even if the module isn't explicitly imported
  • Figure out how to write tests for this, taking into account that MIR isn't a stable IR and might randomly change
  • Adjust the method type signature generator to include the inline keyword if the method should always be inlined
  • Gracefully handle inlining of recursive methods
  • Disable inlining when using --opt=none
  • Add a pass (or passes) that simplify the CFG after running MIR, such that we don't feed LLVM a gigantic pile of IR
  • Remove inline methods that can't be called through dynamic dispatch: Inlining of methods #343 (comment)
  • Generate correct debug information for inlined method calls
  • Make sure std.debug.stacktrace is used correctly in the face of inlined code (e.g. currently it breaks due to it not expecting inlined frames)
  • Prevent duplicate successors/predecessors
  • Merge True and False instructions into Bool
  • Make sure incremental compilation still works well
  • Ensure generated LLVM is valid
  • Update documentation

Resources

@yorickpeterse yorickpeterse added performance Changes related to improving performance and removed type::performance labels Mar 15, 2023
@yorickpeterse yorickpeterse removed this from the 0.2.5 milestone Mar 15, 2023
@yorickpeterse yorickpeterse added the feature New things to add to Inko, such as a new standard library module label Nov 16, 2023
@yorickpeterse
Copy link
Collaborator Author

To add some context: LLVM doesn't optimize across modules unless you enable link time optimizations. LTO can drastically increase compile times, and doesn't always produce better results. Because of this, if you call an Inko method defined in module A from module B, the call won't get inlined. As we use methods for operators (e.g. Int.+), this means Inko code is much slower than it should be due to the sheer number of function calls.

As a test I hacked together some dirty changes that basically link all LLVM modules together and compile them as a whole, completely ignoring incremental compilation in the process. For https://github.com/jinyus/related_post_gen this results in a 5x improvement of runtime performance when using --opt=aggressive, but of course with an increase in compile times.

My long-term desire is to perform enough inlining and optimizations at the MIR level, such that we can then pass that to LLVM to take care of platform specific optimizations. In such a case it doesn't really matter that LLVM only inlines on a per module basis, as we've already done most of the work (and ideally in parallel as much as possible). This would also benefit any future additional backends, should we ever add those.

@yorickpeterse yorickpeterse added this to the 0.17.0 milestone Aug 16, 2024
@yorickpeterse
Copy link
Collaborator Author

For 0.17.0 we'll start by adding support for fn inline, which guarantees the method is inlined at the call site (provided it's called through static dispatch). We can then start using this for important methods such as Int.+ and other operator methods. The syntax will be as follows:

fn inline foo {}
fn inline mut foo {}
fn pub inline mut foo {}

The inline keyword is only available to module methods, static methods, and instance methods. It can't be used for required methods defined in a trait, but it can be used for default methods. It can't be used for async methods as those use a different calling convention and can't be inlined into the callers due to their async nature.

Inlining is done at the MIR level, allowing us to inline across modules. Inlining is performed after type specialization.

When inlining we need to rewrite register IDs. To do so, we take the maximum register ID of the method we're inlining into and note it down as MAX. We then take the registers of the method to inline, and simply append them to the registers list of the method we're inlining into. We then traverse the MIR of the method to inline, and for each register update its ID such that it's the result of original ID + MAX. For instance methods we need to replace register 0 (the receiver) with the receiver register at the call site. Using this approach we shouldn't need any hash maps of sorts to map old to new registers, making this process a bit more efficient.

@yorickpeterse yorickpeterse self-assigned this Aug 16, 2024
@yorickpeterse
Copy link
Collaborator Author

Note to self: if a method is tagged as fn inline and can't be called through dynamic dispatch (= it's not part of a trait), we should not include it in the final machine code. For fn inline methods that are part of trait implementations we can't do this, because they might be called through dynamic dispatch.

@yorickpeterse
Copy link
Collaborator Author

For inlining we need to take care to flush caches accordingly. For example, if module A defines method M and B calls it, then the decision as to whether M should be inlined or not should update the cache of B accordingly. This has to work both ways: if M previously wasn't inlined into B but now it should be inlined, or if it was inlined before but this is no longer desired, the cache for B should be flushed.

Since multiple methods from A might be inlined into B, simply recording B as depending on A isn't enough, because the code generated for B can now change even if the source code for A remains the same.

One option is this:

When inlining M into B, we record the method ID of M into B, resulting in a list of methods inlined into B. When generating object files, we iterate over this list and obtain the full symbol names. We then sort this list and hash it into the object file's hash code. If the list changes, so does the hash code, resulting in the object cache getting flushed.

In addition, we flag B as depending on A such that if the source of A changes but the inline candidates remain the same, and B doesn't otherwise explicitly depend on A, we still flush the cache accordingly.

yorickpeterse added a commit that referenced this issue Oct 8, 2024
The compiler is now able to inline calls to static and instance methods.
For each method we calculate a rough weight/cost, and calls are inlined
into their callers until the maximum weight is reached. Inlining is done
bottom-up using Tarjan's strongly connected components algorithm,
reducing the amount of duplicate inlining work. Inlining is also done in
a deterministic order as to ensure incremental compilation caches can be
reused as much as possible.

The current inlining threshold is on the conservative end as to not
increase compile times and compile-time memory usage too much. Over time
we may relax this based on user reports and any extra optimization
passes we might add.

If a method is annotated with the `inline` keyword, it's _always_
inlined into the caller regardless of it or the caller's weight. This is
meant to be used when you want to guarantee a method is inlined, such as
for the various operator methods of Int, Float, Bool, etc. Because of
this guarantee one should use it sparingly, as to not increase the
compile time and executable size too much.

This fixes #343.

Changelog: added
yorickpeterse added a commit that referenced this issue Oct 8, 2024
The compiler is now able to inline calls to static and instance methods.
For each method we calculate a rough weight/cost, and calls are inlined
into their callers until the maximum weight is reached. Inlining is done
bottom-up using Tarjan's strongly connected components algorithm,
reducing the amount of duplicate inlining work. Inlining is also done in
a deterministic order as to ensure incremental compilation caches can be
reused as much as possible.

The current inlining threshold is on the conservative end as to not
increase compile times and compile-time memory usage too much. Over time
we may relax this based on user reports and any extra optimization
passes we might add.

If a method is annotated with the `inline` keyword, it's _always_
inlined into the caller regardless of it or the caller's weight. This is
meant to be used when you want to guarantee a method is inlined, such as
for the various operator methods of Int, Float, Bool, etc. Because of
this guarantee one should use it sparingly, as to not increase the
compile time and executable size too much.

This fixes #343.

Changelog: added
yorickpeterse added a commit that referenced this issue Oct 8, 2024
The compiler is now able to inline calls to static and instance methods.
For each method we calculate a rough weight/cost, and calls are inlined
into their callers until the maximum weight is reached. Inlining is done
bottom-up using Tarjan's strongly connected components algorithm,
reducing the amount of duplicate inlining work. Inlining is also done in
a deterministic order as to ensure incremental compilation caches can be
reused as much as possible.

The current inlining threshold is on the conservative end as to not
increase compile times and compile-time memory usage too much. Over time
we may relax this based on user reports and any extra optimization
passes we might add.

If a method is annotated with the `inline` keyword, it's _always_
inlined into the caller regardless of it or the caller's weight. This is
meant to be used when you want to guarantee a method is inlined, such as
for the various operator methods of Int, Float, Bool, etc. Because of
this guarantee one should use it sparingly, as to not increase the
compile time and executable size too much.

This fixes #343.

Changelog: added
yorickpeterse added a commit that referenced this issue Oct 8, 2024
The compiler is now able to inline calls to static and instance methods.
For each method we calculate a rough weight/cost, and calls are inlined
into their callers until the maximum weight is reached. Inlining is done
bottom-up using Tarjan's strongly connected components algorithm,
reducing the amount of duplicate inlining work. Inlining is also done in
a deterministic order as to ensure incremental compilation caches can be
reused as much as possible.

The current inlining threshold is on the conservative end as to not
increase compile times and compile-time memory usage too much. Over time
we may relax this based on user reports and any extra optimization
passes we might add.

If a method is annotated with the `inline` keyword, it's _always_
inlined into the caller regardless of it or the caller's weight. This is
meant to be used when you want to guarantee a method is inlined, such as
for the various operator methods of Int, Float, Bool, etc. Because of
this guarantee one should use it sparingly, as to not increase the
compile time and executable size too much.

This fixes #343.

Changelog: added
yorickpeterse added a commit that referenced this issue Oct 8, 2024
The compiler is now able to inline calls to static and instance methods.
For each method we calculate a rough weight/cost, and calls are inlined
into their callers until the maximum weight is reached. Inlining is done
bottom-up using Tarjan's strongly connected components algorithm,
reducing the amount of duplicate inlining work. Inlining is also done in
a deterministic order as to ensure incremental compilation caches can be
reused as much as possible.

The current inlining threshold is on the conservative end as to not
increase compile times and compile-time memory usage too much. Over time
we may relax this based on user reports and any extra optimization
passes we might add.

If a method is annotated with the `inline` keyword, it's _always_
inlined into the caller regardless of it or the caller's weight. This is
meant to be used when you want to guarantee a method is inlined, such as
for the various operator methods of Int, Float, Bool, etc. Because of
this guarantee one should use it sparingly, as to not increase the
compile time and executable size too much.

This fixes #343.

Changelog: added
yorickpeterse added a commit that referenced this issue Oct 8, 2024
The compiler is now able to inline calls to static and instance methods.
For each method we calculate a rough weight/cost, and calls are inlined
into their callers until the maximum weight is reached. Inlining is done
bottom-up using Tarjan's strongly connected components algorithm,
reducing the amount of duplicate inlining work. Inlining is also done in
a deterministic order as to ensure incremental compilation caches can be
reused as much as possible.

The current inlining threshold is on the conservative end as to not
increase compile times and compile-time memory usage too much. Over time
we may relax this based on user reports and any extra optimization
passes we might add.

If a method is annotated with the `inline` keyword, it's _always_
inlined into the caller regardless of it or the caller's weight. This is
meant to be used when you want to guarantee a method is inlined, such as
for the various operator methods of Int, Float, Bool, etc. Because of
this guarantee one should use it sparingly, as to not increase the
compile time and executable size too much.

This fixes #343.

Changelog: added
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
compiler Changes related to the compiler feature New things to add to Inko, such as a new standard library module performance Changes related to improving performance
Projects
None yet
Development

No branches or pull requests

1 participant