-
Notifications
You must be signed in to change notification settings - Fork 326
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Migrating Column.map
and Column.zip
to Java-to-Enso callbacks
#7371
Labels
-libs
Libraries: New libraries to be implemented
Comments
Duplicate of #7317 |
Ah right! Sorry, I forgot about the other one. But I will then reproduce the text from this issue as it introduces some important comments we need to keep in mind. |
mergify bot
pushed a commit
that referenced
this issue
Jul 25, 2023
- Fixes #7231 - Cleans up vectorized operations to distinguish unary and binary operations. - Introduces MixedStorage which may pretend to be a more specialized storage on demand. - Ensures that operations request a more specialized storage on right-hand side to ensure compatibility with reported inferred storage type. - Ensures that a dataflow error returned by an Enso callback in Java is propagated as a polyglot exception and can be caught back in Enso - Tests for comparison of Mixed storages with each other and other types - Started using `Set` for `Filter_Condition.Is_In` for better performance. - ~~Migrated `Column.map` and `Column.zip` to use the Java-to-Enso callbacks.~~ - This does not forward warnings. IMO we should not be losing them. We can switch and add a ticket to fix the warnings, but that would be a regression (current implementation handles them correctly). Instead, we should first gain some ability to work with warnings in polyglot. I created a ticket to get this figured out #7371 - ~~Trying to avoid conversions when calling Enso functions from Java.~~ - Needs extra care as dataflow errors may not be handled right then. So only works for simple functions that should not error. - Not sure how much it really helps. [Benchmarks](#7270 (comment)) suggested it could improve the performance quite significantly, but the practical solution is not exactly the same as the one measured, so we may have to measure and tune it to get the best results. - Created #7378 to track this.
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
The benchmarks are showing that the current approach used for
Column.map
andColumn.zip
is not that efficient - the old approach relying on passing an Enso callback to the Java code is 2-6x faster than the current one.So we wanted to migrate that, but in #7368 I realised we do not have tests for propagating panics/errors/warnings through these callbacks.
With some work, dataflow errors are propagated just like panics. However, currently, the polyglot helpers in Java have no way to tell if a returned value has warnings attached.
Thus, if we just switched without some additional logic, any warnings reported inside of the function passed to
Column.map/zip
would be lost. That does not seem right and it would be a regression from the current state (currently the warnings are attached to the created resulting column).So to be able to migrate this, we need to figure out a way to detect and gather the warnings returned from Enso functions called from within Java.
This will make the logic a bit more complicated, so we may need to actually revise the benchmarks and see if the added complexity of handling warnings does not thwart the expected gains from switching to this different approach.
The text was updated successfully, but these errors were encountered: