Remove overhead of calls from Table java code into Enso code by refactoring the functionality to Enso #6292

Open
Akirathan opened this issue Apr 14, 2023 · 5 comments
Labels: --low-performance, -compiler, -libs (Libraries: New libraries to be implemented), l-apache-arrow (InMemory Table move to Apache Arrow), x-on-hold

Comments

@Akirathan
Member

The "Table.order_by object" benchmark creates a table consisting solely of My atoms with a custom My_Comparator. Most of the time is spent in ObjectComparator.ensoCompare, which calls back into Enso from Java across the language boundary.

The simplest and quickest way to improve performance is to move some of the functionality currently implemented in the org.enso.base.table Java package into Enso, so that these kinds of callbacks are no longer necessary.

After moving the functionality to Enso, shared code between libs and runtime may no longer be needed (#5259).

@radeusgd
Member

This should include moving the callback part of the MultiValueIndex and other MultiValueKey methods to Enso too, so that we avoid all Java-to-Enso callbacks in the Table library.

@jdunkerley jdunkerley added this to the Beta Release milestone Apr 18, 2023
@jdunkerley jdunkerley moved this from ❓New to 📤 Backlog in Issues Board Apr 18, 2023
@radeusgd
Member

Once we move the MultiValueIndex to Enso, we should implement a table.is_unique columns method, which can be used to check the primary_key uniqueness condition in Upload_Table more efficiently.
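
To illustrate the kind of check meant here, below is a minimal Java sketch of a multi-column uniqueness test. It is not the proposed Enso implementation: the class and method names (`UniqueKeySketch`, `isUnique`) are hypothetical, columns are modelled as plain lists, and keys are compared with Java `equals`/`hashCode` rather than Enso equality or custom comparators. The point is only that the check can stop at the first duplicate key instead of building a full index.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper for illustration; not part of the Enso Table library.
final class UniqueKeySketch {
    private UniqueKeySketch() {}

    /**
     * Returns true if no two rows share the same combination of values
     * across the given key columns. Each inner list holds the cells of one
     * column; all columns are assumed to have the same length.
     */
    static boolean isUnique(List<List<Object>> keyColumns) {
        if (keyColumns.isEmpty()) {
            throw new IllegalArgumentException("at least one key column is required");
        }
        int rowCount = keyColumns.get(0).size();
        Set<List<Object>> seen = new HashSet<>();
        for (int row = 0; row < rowCount; row++) {
            // Build the multi-value key for this row.
            List<Object> key = new ArrayList<>(keyColumns.size());
            for (List<Object> column : keyColumns) {
                key.add(column.get(row));
            }
            // Set.add returns false if an equal key was already present,
            // so the scan stops at the first duplicate.
            if (!seen.add(key)) {
                return false;
            }
        }
        return true;
    }
}
```

For example, key columns `[1, 1, 2]` and `["a", "b", "a"]` form the unique row keys `(1, a)`, `(1, b)`, `(2, a)`, while `[1, 1, 2]` and `["a", "a", "b"]` contain `(1, a)` twice and fail the check.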

@radeusgd
Member

The first steps towards this were done in #6890.

mergify bot pushed a commit that referenced this issue Jun 2, 2023
Closes #5227

# Important Notes
- This lays first steps towards #6292 - we get pure Enso variants of MultiValueKey.
- Another part refactors `LongStorage` into `AbstractLongStorage` allowing it to provide alternative implementations of the underlying storage, in our case `LongRangeStorage` generating the values ad-hoc and `LongConstantStorage` - currently unused but in the future it can be adapted to support constant columns (once we implement similar facilities for other types).
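
The storage split described in the second note can be pictured roughly as follows. This is a heavily simplified Java sketch under assumed names (`AbstractLongStorageSketch` and its two subclasses are hypothetical); the real `AbstractLongStorage`, `LongRangeStorage` and `LongConstantStorage` classes have a richer API that is not reproduced here.

```java
import java.util.Objects;

// Hypothetical simplification: a storage exposes a size and per-index access,
// and subclasses decide how the values are actually produced.
abstract class AbstractLongStorageSketch {
    abstract int size();

    abstract long getItem(int index);
}

// Computes each value on demand from a start and a step instead of
// keeping a backing array.
final class LongRangeStorageSketch extends AbstractLongStorageSketch {
    private final long start;
    private final long step;
    private final int size;

    LongRangeStorageSketch(long start, long step, int size) {
        this.start = start;
        this.step = step;
        this.size = size;
    }

    @Override
    int size() {
        return size;
    }

    @Override
    long getItem(int index) {
        Objects.checkIndex(index, size);
        return start + step * (long) index;
    }
}

// Returns the same value for every valid index.
final class LongConstantStorageSketch extends AbstractLongStorageSketch {
    private final long value;
    private final int size;

    LongConstantStorageSketch(long value, int size) {
        this.value = value;
        this.size = size;
    }

    @Override
    int size() {
        return size;
    }

    @Override
    long getItem(int index) {
        Objects.checkIndex(index, size);
        return value;
    }
}
```

Code written against the abstract type does not need to know whether values are stored, generated, or constant, which is what makes the alternative implementations possible.
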
@jdunkerley
Member

PR #7270

@jdunkerley
Member

This is on hold and should be tackled as we work on the storage refactor. If we move to Apache Arrow, this could be essential.

@jdunkerley jdunkerley added x-on-hold l-apache-arrow InMemory Table move to Apache Arrow labels Sep 5, 2023
@jdunkerley jdunkerley moved this from 📤 Backlog to ❓New in Issues Board Sep 5, 2023
Projects
Status: New
Development

No branches or pull requests

3 participants