chore: Improve error handling when native lib fails to load #1000

Merged
merged 2 commits into from
Oct 23, 2024

Conversation

andygrove
Member

@andygrove andygrove commented Oct 7, 2024

Which issue does this PR close?

Closes #999

Rationale for this change

If the static initialization block failed to load the native library, we did not always see the reason why.

Before:

24/10/07 22:42:11 WARN CometSparkSessionExtensions: Comet extension is disabled because of error when loading native lib. Falling back to Spark
java.lang.NoClassDefFoundError: Could not initialize class org.apache.comet.NativeBase

After:

24/10/22 15:00:08 WARN CometSparkSessionExtensions: Comet extension is disabled because of error when loading native lib. Falling back to Spark
java.lang.UnsatisfiedLinkError: /tmp/libcomet-11952721851997920414.so: /lib/x86_64-linux-gnu/libm.so.6: version `GLIBC_2.35' not found (required by /tmp/libcomet-11952721851997920414.so)
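
One plausible reading of the "Before" log, sketched in Java under stated assumptions: when a static initializer throws an Error, the JVM marks the class as unusable, and every later access fails with NoClassDefFoundError ("Could not initialize class ...") instead of the original error. FragileLib and StaticInitFailureDemo are hypothetical names, and the error message is made up for the demo; this is not Comet's code.

```java
// Illustration only: FragileLib is a hypothetical stand-in for a class whose
// static initializer loads a native library.
class FragileLib {
    static {
        // Simulate a failing native load. The "if (true)" keeps the compiler
        // from rejecting an initializer that can never complete normally.
        if (true) {
            throw new UnsatisfiedLinkError("simulated: required GLIBC version not found");
        }
    }

    static boolean isLoaded() {
        return true;
    }
}

class StaticInitFailureDemo {
    /** Calls FragileLib.isLoaded() and returns the class name of whatever is thrown. */
    static String attempt() {
        try {
            FragileLib.isLoaded();
            return "no error";
        } catch (Throwable t) {
            return t.getClass().getName();
        }
    }

    public static void main(String[] args) {
        // The first access triggers class initialization and sees the original Error.
        System.out.println(attempt());
        // Later accesses find the class in an erroneous state.
        System.out.println(attempt());
    }
}
```

In this sketch only the first caller sees the UnsatisfiedLinkError; if that first failure is swallowed or happens on another code path, later callers are left with the less informative NoClassDefFoundError.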

What changes are included in this PR?

Improve error handling by catching exceptions in the static init code and then rethrowing them when isCometEnabled calls isLoaded.
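
A minimal sketch of that pattern, assuming a saved-Throwable field: the static initializer catches the failure so class initialization succeeds, and the state check rethrows the saved cause. DeferredLoad, loadError, and the library path are hypothetical and chosen only for illustration; the merged code may differ in its details.

```java
// Sketch only: DeferredLoad and its members are hypothetical names.
class DeferredLoad {
    private static boolean loaded = false;
    private static Throwable loadError = null;

    static {
        try {
            load();
        } catch (Throwable t) {
            // Swallow here so class initialization succeeds; keep the cause for later.
            loadError = t;
        }
    }

    /** Stand-in for the real native load; the path is made up and should not exist. */
    private static synchronized void load() {
        if (loaded) {
            return;
        }
        System.load("/nonexistent/libdemo.so");
        loaded = true;
    }

    /** Rethrows the saved load failure, if any; otherwise reports the load state. */
    static synchronized boolean isLoaded() throws Throwable {
        if (loadError != null) {
            throw loadError;
        }
        return loaded;
    }
}

class DeferredLoadDemo {
    /** Mimics a caller that reports the real cause and falls back. */
    static String check() {
        try {
            return DeferredLoad.isLoaded() ? "enabled" : "disabled";
        } catch (Throwable t) {
            return "disabled: " + t.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        System.out.println(check());
    }
}
```

Because the class itself initializes successfully, every call to isLoaded() reports the same root cause, not just the first one.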

How are these changes tested?

Manually.

@andygrove andygrove changed the title from "Remove NativeBase static initializer" to "chore: Remove NativeBase static initializer (to improve error handling when native lib fails to load)" Oct 7, 2024
@andygrove
Member Author

@parthchandra could you review?

Contributor

@parthchandra parthchandra left a comment

lgtm. Just asking for one minor clarification.

@@ -1105,7 +1105,8 @@ object CometSparkSessionExtensions extends Logging {
try {
// This will load the Comet native lib on demand, and if success, should set
// `NativeBase.loaded` to true
NativeBase.isLoaded
NativeBase.load()
Contributor

I'm surprised we have to do this. I always understood that the static initializer would be called before we can invoke a static method.

Member Author

The goal of this PR was to remove the static initializer and explicitly load the native library so that we can catch exceptions and report them.

@@ -24,6 +24,13 @@
import org.apache.comet.NativeBase;

public final class Native extends NativeBase {

static {
Contributor

Any reason for moving this from the base class?

Member Author

It turns out that we still rely on the static initializer for the native Parquet code, which I had not realized when I started on this PR.

Member

Hmm, if only scan is enabled, can we still catch the exception when the native lib fails to load?

Contributor

We could potentially catch UnsatisfiedLinkError (and other exceptions) in NativeBase.bundleLoadLibrary or NativeBase.load. But this being static initialization, I'm not sure if the logger would have been initialized at that point.
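
A hedged sketch of catching at the load site without relying on a logger: the failure is written to System.err, which does not depend on a logging framework being configured, and is also returned so a caller can report it later. LoadSite, tryLoad, and the library path are hypothetical names for illustration, not Comet's API.

```java
// Sketch only: LoadSite and tryLoad are hypothetical names.
class LoadSite {
    /**
     * Attempts System.load on the given path. Returns null on success, or the
     * Throwable that caused the failure so the caller can report it later.
     */
    static Throwable tryLoad(String absolutePath) {
        try {
            System.load(absolutePath);
            return null;
        } catch (UnsatisfiedLinkError e) {
            // The common case: missing file, wrong architecture, missing symbol version.
            System.err.println("native load failed: " + e.getMessage());
            return e;
        } catch (Throwable t) {
            // Anything else (for example a SecurityException) is kept as well.
            System.err.println("native load failed: " + t);
            return t;
        }
    }

    public static void main(String[] args) {
        // The path is made up and is expected not to exist.
        Throwable failure = tryLoad("/nonexistent/libdemo.so");
        System.out.println(failure == null ? "loaded" : failure.getClass().getName());
    }
}
```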

@andygrove andygrove marked this pull request as draft October 8, 2024 00:14
@andygrove andygrove marked this pull request as ready for review October 22, 2024 15:02
@andygrove
Member Author

@parthchandra @viirya I reimplemented this with a much simpler approach. PTAL when you can.

private static final String searchPattern = "libcomet-";

static {
if (!isLoaded()) {
Member Author

load is a no-op if loaded == true, so there is no need for this conditional here

}
}
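
To illustrate why an outer check is redundant, here is a minimal sketch assuming load() begins with a guard on the loaded flag and returns immediately once the library is already loaded. IdempotentLoader and the attempt counter are hypothetical and exist only for the demo.

```java
// Sketch only: IdempotentLoader is a hypothetical name.
class IdempotentLoader {
    private static boolean loaded = false;
    private static int loadAttempts = 0; // counts real loads, for the demo only

    static synchronized void load() {
        if (loaded) {
            return; // already loaded: nothing to do
        }
        loadAttempts++; // a real implementation would load the native library here
        loaded = true;
    }

    static synchronized int loadAttempts() {
        return loadAttempts;
    }

    public static void main(String[] args) {
        load();
        load();
        load();
        System.out.println(loadAttempts());
    }
}
```

With the guard inside load(), callers can invoke it unconditionally; wrapping the call in `if (!isLoaded())` repeats the same check.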

- public static synchronized boolean isLoaded() {
+ public static synchronized boolean isLoaded() throws Throwable {
Member Author

This is called from isCometEnabled, which already had error handling that could never be triggered because isLoaded was not capable of throwing an exception.

Member Author

Correction: calling isLoaded could have resulted in a NoClassDefFoundError, but that error did not include details of the root cause.

@andygrove andygrove changed the title from "chore: Remove NativeBase static initializer (to improve error handling when native lib fails to load)" to "chore: Improve error handling when native lib fails to load" Oct 22, 2024
Contributor

@parthchandra parthchandra left a comment

lgtm, this is much cleaner.

@andygrove andygrove merged commit 845b654 into apache:main Oct 23, 2024
74 checks passed
@andygrove andygrove deleted the remove-static-init branch October 23, 2024 23:56
Development

Successfully merging this pull request may close these issues.

Plugin can fail to initialize native library and hide the root cause
4 participants