feat(pgsrv): Implement start of postgres extended query protocol #117

Merged on Sep 27, 2022 (12 commits)


Contributor

@justinrubek justinrubek commented Sep 20, 2022

Implements many of the code paths needed for the protocol. A lot of things still need to be worked on. This implements enough functionality to be used against the benchbase tests, but the implementation of many messages is limited:

  • Parse messages create a prepared statement on the session
    • this stores the sql string but doesn't parse it for placeholders
  • Bind messages create a portal on the session
    • the sql statement is planned but no placeholder replacement is performed
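
A minimal sketch of the session state described above, assuming statements and portals are kept in name-keyed maps. The `Session`, `PreparedStatement`, and `Portal` types and the `String` stand-in for a logical plan are hypothetical and not taken from this PR:

```rust
use std::collections::HashMap;

/// Prepared statement created by Parse: only the raw SQL is stored,
/// with no placeholder parsing (hypothetical type, for illustration).
#[derive(Debug, Clone, PartialEq)]
pub struct PreparedStatement {
    pub sql: String,
    pub param_types: Vec<i32>,
}

/// Portal created by Bind. `plan` is a placeholder for a real logical plan.
#[derive(Debug, Clone, PartialEq)]
pub struct Portal {
    pub statement_name: String,
    pub plan: String,
}

/// Per-connection state holding statements and portals by name.
#[derive(Debug, Default)]
pub struct Session {
    statements: HashMap<String, PreparedStatement>,
    portals: HashMap<String, Portal>,
}

impl Session {
    /// Handle Parse: remember the SQL string under the statement name.
    pub fn parse(&mut self, name: &str, sql: &str, param_types: Vec<i32>) {
        let stmt = PreparedStatement {
            sql: sql.to_string(),
            param_types,
        };
        self.statements.insert(name.to_string(), stmt);
    }

    /// Handle Bind: "plan" the stored SQL (no placeholder replacement)
    /// and keep the result as a portal. Fails if the statement is unknown.
    pub fn bind(&mut self, portal: &str, statement: &str) -> Result<(), String> {
        let stmt = self
            .statements
            .get(statement)
            .ok_or_else(|| format!("unknown prepared statement: {}", statement))?;
        // Stand-in for real planning.
        let plan = format!("PLAN({})", stmt.sql);
        let entry = Portal {
            statement_name: statement.to_string(),
            plan,
        };
        self.portals.insert(portal.to_string(), entry);
        Ok(())
    }

    /// Look up a portal, e.g. when handling Execute.
    pub fn portal(&self, name: &str) -> Option<&Portal> {
        self.portals.get(name)
    }
}
```

Binding against a name that was never parsed returns an error here; how the real server reports that case is not covered by this description.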

@justinrubek
Contributor Author

The extended protocol needs the following messages handled

  • Parse
    • Parse statement and find locations to replace for parameters
    • Store inside session for later retrieval
  • Bind
    • plan query
    • Store inside session for later retrieval
  • Execute
    • Run query from portal
  • Describe
    • portal
      • Ddl
      • Write
      • Query
      • Transaction
    • statement
  • Sync
  • Terminate
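
For reference, the messages listed above can be told apart by the one-byte tag that starts each frontend message in the Postgres wire protocol. The sketch below maps those tags to an enum; the `ExtendedMessage` and `DescribeTarget` names are hypothetical and not the types used in pgsrv:

```rust
/// Frontend messages from the checklist above (hypothetical enum).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ExtendedMessage {
    Parse,
    Bind,
    Execute,
    Describe,
    Sync,
    Terminate,
}

impl ExtendedMessage {
    /// Map a frontend message tag byte to a message kind.
    /// Returns `None` for tags outside this checklist (e.g. b'Q', simple Query).
    pub fn from_tag(tag: u8) -> Option<Self> {
        match tag {
            b'P' => Some(Self::Parse),
            b'B' => Some(Self::Bind),
            b'E' => Some(Self::Execute),
            b'D' => Some(Self::Describe),
            b'S' => Some(Self::Sync),
            b'X' => Some(Self::Terminate),
            _ => None,
        }
    }
}

/// What a Describe message refers to: the first body byte is
/// b'S' for a prepared statement or b'P' for a portal.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum DescribeTarget {
    Statement,
    Portal,
}

impl DescribeTarget {
    pub fn from_byte(b: u8) -> Option<Self> {
        match b {
            b'S' => Some(Self::Statement),
            b'P' => Some(Self::Portal),
            _ => None,
        }
    }
}
```

Note that `b'P'` and `b'S'` mean different things as a message tag and as a Describe target, which is why the two lookups are kept separate.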

justinrubek and others added 2 commits September 21, 2022 16:37
* chore: Update to datafusion 12 (#114)

* First draft of key layout using RocksDB (#116)

* First draft of key layout using RocksDB

* Additional future considerations from Sean

* feat: Add information_schema (#115)

* feat: Add information_schema

Fixes #98

Cloud will be making requests directly to the database to get info about the
contents of the database, including schemas, tables, and columns.

* fix: Remove datafusion-proto crate (#119)

We're not using it for anything yet, and this release seems to break building
container images.

```
error: builder for '/nix/store/75lddm4kg8mzn2x5nz8lg36gdj16p7ka-glaredb-cli-0.1.0.drv' failed with exit code 101;
       last 10 log lines:
       > Caused by:
       >   process didn't exit successfully: `/build/source/target/release/build/datafusion-proto-497b9ae0fe438eda/build-script-build` (exit status: 1)
       >   --- stdout
       >   cargo:rerun-if-env-changed=FORCE_REBUILD
       >   cargo:rerun-if-changed=proto/datafusion.proto
       >   Running: "/nix/store/2qg94y58v1jr4dw360bmpxlrs30m31ca-protobuf-3.19.4/bin/protoc" "--include_imports" "--include_source_info" "-o" "/build/prost-buildFXFfZG/prost-descriptor-set" "-I" "proto" "-I" "/nix/store/2qg94y58v1jr4dw360bmpxlrs30m31ca-protobuf-3.19.4/include" "proto/datafusion.proto"
       >
       >   --- stderr
       >   Error: "protobuf compilation failed: Permission denied (os error 13)"
       > warning: build failed, waiting for other jobs to finish...
       For full logs, run 'nix log /nix/store/75lddm4kg8mzn2x5nz8lg36gdj16p7ka-glaredb-cli-0.1.0.drv'.
```

Possibly related: apache/datafusion#3538

* feat: Implement raft via gRPC (#63)

* Replace toy-rpc with tonic gRPC

* implement glaredb cli for raft nodes and client

* current progress

* implement begin, allocate_table, and get_schema

* implement scan

* implement insert

* cleanup

* comment out old tests

* clean up ConsensusClient

* Implement change membership command

* rewrite cluster tests to use RaftClientSource

* add protoc to CI

* switch raft to in-memory implementation

* Remove application logic from raft cluster tests

* cargo fmt

* add tracing to RPC impls

* Remove lemur from raft crate

* remove raft_client example

* Apply suggestions from code review

Co-authored-by: Sean Smith <[email protected]>

* remove protoc from ci

* Remove lemur_impl from raft crate

* Store tonic clients instead of endpoint in ConsensusClient

* use shared n_retries

* Add default num_retries

* Apply suggestions from code review

Co-authored-by: Rustom Shareef <[email protected]>

* moved some mod.rs modules into their parent directories

* implement ConsensusClient retry to find leader using macro

* Fix missing delimiter

* fix clippy issues

* rewrite retry_rpc_on_leader macro to evaluate to an expression

* remove panics in rpc server impls

Co-authored-by: Sean Smith <[email protected]>
Co-authored-by: Rustom Shareef <[email protected]>

* build(nix): Use crane to cache cargo dependencies (#121)

* Add crane

* switch rust toolchain to come from fenix

* touch buildscript before executing cargo build

* add clippy and build checks

* rename arrowstore build script

* rename raft build script

* Send back BindComplete instead of ParseComplete

Also moved sending results into its own function since we need to send results
back after Execute commands complete.

* Add logical plan stub for SETting runtime vars

Also fixes logic for checking pg message length.

Co-authored-by: Rustom Shareef <[email protected]>
Co-authored-by: Justin Rubek <[email protected]>
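
The commit note about checking the pg message length does not say what the bug was. As background only: in the Postgres wire protocol a regular message is a one-byte tag followed by a big-endian Int32 length that counts itself (4 bytes) but not the tag. A minimal sketch of a completeness check under that rule, with a hypothetical function name that is not taken from the PR:

```rust
/// Inspect the start of `buf` and report whether a full message is present.
///
/// Returns `Ok(Some(n))` with the total size in bytes (tag + length + body)
/// when the first message is complete, `Ok(None)` when more bytes are
/// needed, and `Err` when the length field is invalid.
/// (Hypothetical helper; the real pgsrv code may be structured differently.)
pub fn complete_message_len(buf: &[u8]) -> Result<Option<usize>, String> {
    // 1 tag byte + 4 length bytes.
    const HEADER: usize = 5;
    if buf.len() < HEADER {
        return Ok(None);
    }
    let len = i32::from_be_bytes([buf[1], buf[2], buf[3], buf[4]]);
    // The length includes its own 4 bytes, so anything smaller is malformed.
    if len < 4 {
        return Err(format!("invalid message length: {}", len));
    }
    // The tag byte is not counted by the length field.
    let total = 1 + len as usize;
    if buf.len() < total {
        Ok(None)
    } else {
        Ok(Some(total))
    }
}
```

A Sync message, for example, is the tag `S` followed by a length of 4 and no body, so it is complete at 5 bytes.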
@justinrubek justinrubek mentioned this pull request Sep 27, 2022
14 tasks
@justinrubek
Contributor Author

I've turned #101 into a tracking issue for the bigger picture of getting the extended query protocol in. I think this may be better done in a few parts.

@justinrubek justinrubek changed the title feat(pgsrv): Implement postgres extended query protocol feat(pgsrv): Implement start of postgres extended query protocol Sep 27, 2022
@justinrubek justinrubek marked this pull request as ready for review September 27, 2022 09:09
Member

@scsmithr scsmithr left a comment

Very nice.

One thing we'll probably want to add once complete is some sqllogictests verifying behavior for the extended protocol.

Comment on lines +155 to +158
let mut param_types = Vec::with_capacity(num_params);
for _ in 0..num_params {
    param_types.push(buf.get_i32());
}
Contributor

Neat snippet to use when setting up a vec with known size. Not sure how exactly this compares though with respect to performance.

And not at all necessary, just something I thought I would share.

Suggested change
let mut param_types = Vec::with_capacity(num_params);
for _ in 0..num_params {
    param_types.push(buf.get_i32());
}
let param_types: Vec<i32> = (0..num_params).map(|_| buf.get_i32()).collect();
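
To compare the two forms side by side, here is a standalone sketch using only the standard library. The `get_i32` helper is a hypothetical stand-in for the `buf.get_i32()` call in the PR, and both function names are made up for illustration. On performance: `collect` reserves capacity from the iterator's size hint, and a `Range<usize>` reports an exact one, so the two versions would be expected to allocate similarly, though this has not been benchmarked here.

```rust
/// Hypothetical stand-in for `buf.get_i32()`: reads a big-endian i32
/// from the front of the slice and advances it.
/// Panics if fewer than 4 bytes remain.
fn get_i32(buf: &mut &[u8]) -> i32 {
    let (head, rest) = buf.split_at(4);
    *buf = rest;
    i32::from_be_bytes([head[0], head[1], head[2], head[3]])
}

/// Original form: preallocate, then push in a loop.
pub fn read_with_loop(mut buf: &[u8], num_params: usize) -> Vec<i32> {
    let mut param_types = Vec::with_capacity(num_params);
    for _ in 0..num_params {
        param_types.push(get_i32(&mut buf));
    }
    param_types
}

/// Suggested form: map over the range and collect.
/// The closure must take the (ignored) index argument, hence `|_|`.
pub fn read_with_collect(mut buf: &[u8], num_params: usize) -> Vec<i32> {
    (0..num_params).map(|_| get_i32(&mut buf)).collect()
}
```

Both functions return the same vector for the same input, so the choice is mostly one of style.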

@justinrubek justinrubek merged commit f980994 into main Sep 27, 2022
@greyscaled greyscaled deleted the extended-query-protocol branch December 1, 2022 20:31
scsmithr added a commit that referenced this pull request Nov 20, 2024
…117)

* allow zero aggregates in agg hash table, use for distinct (ALL)

* hash nulls

* nested queries, uncomment some tests

* use single user engine in shell

* provide queries as arg

* int64 timestamps from parquet

* bump version

* additional parquet tests

* file source trait

* read stream for http

* csv decoder

* begin wiring up

* tests

* fix bugs

* Add test, implement stream for wasm

* bump version

* lint

* comments
3 participants