Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(rust)!: add workspace with generated bindings crate #1326

Closed
wants to merge 11 commits into from

Conversation

mbrobbel
Copy link
Contributor

@mbrobbel mbrobbel commented Nov 27, 2023

The main goal of this PR is to start making some changes that should help to get #1305 ready for review. I'll update that PR later to be compatible with the changes introduced here.

A summary of the changes:

  • Add a Cargo workspace for the Rust implementation of ADBC, this allow having separate crates for different components (like drivers).
  • Add a bindings crate arrow-adbc-sys to the workspace that auto generates bindings based on adbc.h using bindgen when the generate features is enabled. There is a symbolic link to the adbc.h header in the bindings crate. I noticed that the go lib has a copy that is checked to be in sync with pre-commit, if that is preferred I can update. CI checks if the checked-in bindings are in sync with adbc.h header.
  • Update some existing definitions to use the generated bindings.
  • Add dependabot config to keep Rust dependencies up-to-date.
  • Remove usage of non-approved GitHub actions for Rust workflow.
  • Check in Cargo.lock file following this blog post.
  • Remove an unused dev dependency.
  • Fixed some Clippy warnings.
  • Updated dependencies.
  • Remove Adbc prefix from item names.
  • Remove deprecated function from std::error::Error impl, added a Result type.
  • Added some missing ingest options.

I plan on updating the trait definitions in a follow-up PR to provide both sync and async versions.

@lidavidm
Copy link
Member

This makes the ADBC core traits depend on the bindings right? I think we don't want to do that to allow for pure-Rust drivers.

@@ -64,25 +64,26 @@ pub mod objects;

use std::collections::HashMap;

use arrow_adbc_sys as adbc;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In particular here - we don't want this dependency from the core ADBC to the bindings

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To be precise; the bindings here are generated items at build time based on adbc.h. The items are structs (e.g. AdbcConnection, AdbcDatabase), constants (e.g. ADBC_VERSION_1_1_0), functions (e.g. AdbcDatabaseInit) and type aliases (e.g. AdbcStatusCode).

We could choose to put some of the defintions behind a ffi feature, but I think always using the constants from the header is sensible, considering adbc.h the single source of truth for these.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will it be acceptable to pure-Rust users to have this dependency, even at build time? (Does bindgen etc require a C compiler?)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To be clear: there's no requirement to have Rust mirror C 1:1. Staying close to the general structure to ease FFI makes sense, but you could absolutely choose to have pure-Rust definitions and translate to/from FFI definitions purely in the FFI code, instead of forcing all definitions to be identical.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good question. bindgen uses clang to parse and analyze headers.

I think it's fine but others may disagree. I can revert this and keep copies of those definitions in Rust to prevent that dependency, and put the generated headers behind an ffi feature flag.

Or an alternative approach: check in the generated bindings (and regenerate when adbc.h changes).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, but openssl is the canonical example of this: crates often have a feature to use rustls instead to avoid the native dependency, right?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's true, although I've never had issues with openssl that caused me to move to rustls. 🤷

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My main concern is locking us into a path that will be hard to back out of. I suppose if it's only constant values then we can always copy them over later and break the dependency, but beyond that it may get painful if it turns out that pure-Rust is a major user concern.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated the bindings crate to only generate when the generate feature is enabled (non-default), removing the clang dependency for the main crate. This then updates the checked-in bindings. CI checks if the bindings are up-to-date.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd agree here: we don't want dependency from ADBC-Rust API to the bindings. It prevents pure-rust ADBC implementations.

General rule would be that ADBC-Rust API should not to contain anything C API specific.

@@ -35,17 +35,17 @@
//! There are two flavors of ADBC that this library supports:
//!
//! * **Native Rust implementations**. These implement the traits at the top level of
//! this crate, starting with [AdbcDatabase].
//! this crate, starting with [Database].
//! * **C API ADBC drivers**. These can be implemented in any language (that compiles
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And really the core definitions should not reference the C API at all

@mbrobbel mbrobbel marked this pull request as draft November 27, 2023 16:52
@mbrobbel mbrobbel marked this pull request as ready for review November 28, 2023 09:52
@mbrobbel mbrobbel changed the title feat(rust)!: add workspace with auto-generated bindings crate feat(rust)!: add workspace with generated bindings crate Nov 28, 2023
@lidavidm lidavidm removed this from the ADBC Libraries 0.9.0 milestone Dec 19, 2023
@mbrobbel
Copy link
Contributor Author

Closing in favor of #1723.

@mbrobbel mbrobbel closed this Apr 16, 2024
@mbrobbel mbrobbel deleted the rust-workspace branch April 16, 2024 09:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants