[FEA] Consider exploring JIT compilation/LTO to replace AST evaluation #15366

Open
jrhemstad opened this issue Mar 21, 2024 · 0 comments
Labels
feature request New feature or request

Comments

@jrhemstad
Contributor

Is your feature request related to a problem? Please describe.
The AST machinery fundamentally limits the performance we can achieve. We should explore using JIT compilation/linking instead.

Describe the solution you'd like

We explored JIT compilation in the past, but it was too slow at the time. A number of things have changed since then:

  • NVRTC has made significant improvements in runtime compilation (150ms -> 25ms fixed overhead)
  • JIT LTO is a thing now
  • NVRTC supports pre-compiled headers now

All of these things contribute to the potential for significantly faster runtime compilation.

The basic idea would be to pre-compile the mixed join kernel and treat the equality comparator as an extern function.

Then at runtime, we JIT compile only the comparator and use JIT LTO to link it into the kernel, avoiding the cost of the extern function call.

That way we aren't JIT compiling the entire kernel, which should further reduce the runtime cost.

We could even do this without forcing any user-facing changes. We could take the expression tree that a user gives us today and translate that into a string of C++ code that does the operation expressed by the AST.
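
To make the tree-to-string idea concrete, here is a minimal host-side sketch. Everything in it is hypothetical: the `Expr` struct is a simplified stand-in rather than libcudf's actual AST classes, the `left_N`/`right_N` identifiers are placeholders for whatever row accessors the pre-compiled kernel would expose, and the `row_predicate` name and signature are assumptions made only for illustration.

```cpp
#include <memory>
#include <string>
#include <utility>

// Hypothetical, simplified expression tree (not libcudf's actual AST classes).
struct Expr {
  enum class Kind { ColumnRef, Literal, Binary };
  Kind kind;
  std::string text;           // ColumnRef: identifier, Literal: value, Binary: operator
  std::unique_ptr<Expr> lhs;  // Binary only
  std::unique_ptr<Expr> rhs;  // Binary only
};

using ExprPtr = std::unique_ptr<Expr>;

ExprPtr make_node(Expr::Kind kind, std::string text,
                  ExprPtr lhs = nullptr, ExprPtr rhs = nullptr) {
  auto node = std::make_unique<Expr>();
  node->kind = kind;
  node->text = std::move(text);
  node->lhs = std::move(lhs);
  node->rhs = std::move(rhs);
  return node;
}

// Column references become placeholder identifiers such as left_0 / right_0.
ExprPtr column(bool from_left_table, int index) {
  return make_node(Expr::Kind::ColumnRef,
                   (from_left_table ? "left_" : "right_") + std::to_string(index));
}

ExprPtr literal(std::string value) {
  return make_node(Expr::Kind::Literal, std::move(value));
}

ExprPtr binary(std::string op, ExprPtr lhs, ExprPtr rhs) {
  return make_node(Expr::Kind::Binary, std::move(op), std::move(lhs), std::move(rhs));
}

// Recursively render the tree as a fully parenthesized C++ expression.
std::string to_source(const Expr& e) {
  if (e.kind == Expr::Kind::Binary) {
    return "(" + to_source(*e.lhs) + " " + e.text + " " + to_source(*e.rhs) + ")";
  }
  return e.text;  // column references and literals are emitted verbatim
}

// Wrap the expression in the function the pre-compiled kernel would call.
// The name and the elided parameter list are assumptions for this sketch.
std::string make_comparator_source(const Expr& predicate) {
  return "extern \"C\" __device__ bool row_predicate(/* row accessors elided */) {\n"
         "  return " + to_source(predicate) + ";\n}\n";
}

// Example tree: left column 0 equals right column 0, and left column 1 > 5.
ExprPtr example_predicate() {
  return binary("&&",
                binary("==", column(true, 0), column(false, 0)),
                binary(">", column(true, 1), literal("5")));
}
```

Calling `to_source(*example_predicate())` yields `((left_0 == right_0) && (left_1 > 5))`. The string returned by `make_comparator_source` is what would be handed to the runtime compiler; parenthesizing every binary node sidesteps operator-precedence questions at the cost of slightly noisier source.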

Furthermore, by pre-compiling the relevant headers, we can further reduce the runtime costs.

Additional context

There's potential for extending this idea beyond AST stuff.

Any of the places where we're currently dispatching to different comparator instantiations based on the presence of nested types would also be a prime target for JIT compilation/LTO.

The primary benefit there would be reduced compile time and binary size, since we would avoid statically instantiating as many independent code paths.

There could be opportunity for performance benefits as well by JIT compiling for only the exact types needed and eliminating the type dispatcher from the critical path in the row-based operators.
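
As a rough illustration of compiling for only the exact types needed, the sketch below emits comparator source in which every column's type is already resolved, so the generated device code contains no runtime type dispatch. The `TypeId` enum, the `row_equal` name, and the `void const* const*` column layout are all hypothetical simplifications, not libcudf's actual types or row operators.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical type ids; a real library would have many more entries.
enum class TypeId { Int32, Int64, Float64 };

// Map a runtime type id to the C++ type name used in the generated source.
std::string type_name(TypeId id) {
  switch (id) {
    case TypeId::Int32: return "int";
    case TypeId::Int64: return "long long";
    case TypeId::Float64: return "double";
  }
  throw std::invalid_argument("unsupported type id");
}

// Emit a row-equality function specialized for exactly these column types.
// The dispatch on TypeId happens here on the host, once, while generating
// source; the emitted code only contains concrete casts and comparisons.
std::string specialized_row_equal_source(const std::vector<TypeId>& column_types) {
  std::string body;
  for (std::size_t c = 0; c < column_types.size(); ++c) {
    const std::string cast = "static_cast<" + type_name(column_types[c]) + " const*>";
    const std::string idx = std::to_string(c);
    if (c > 0) { body += "\n      && "; }
    body += cast + "(lhs[" + idx + "])[i] == " + cast + "(rhs[" + idx + "])[j]";
  }
  if (body.empty()) { body = "true"; }  // zero columns: rows trivially equal
  return "extern \"C\" __device__ bool row_equal(void const* const* lhs, "
         "void const* const* rhs, int i, int j) {\n  return " + body + ";\n}\n";
}
```

For a table with an `Int32` column followed by a `Float64` column, `specialized_row_equal_source({TypeId::Int32, TypeId::Float64})` produces a two-term comparison with `int` and `double` casts. Nested types are left out of this sketch; handling them would mean generating the nested comparison logic instead of selecting a statically instantiated code path.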

Projects
Status: To be revisited