make ResultMetadata lifetime-generic #1082

Merged
merged 9 commits into from
Oct 7, 2024

Conversation

wprzytula
Collaborator

@wprzytula wprzytula commented Oct 2, 2024

Ref: #462

This is a necessary step towards non-allocating (in fact less-allocating) result metadata deserialization.

What's done

  • ColumnType is made ownership-generic; there will be two distinct functions to deserialize owned and borrowed ColumnType;
  • ColumnSpec is made ownership-generic; in a follow-up PR, there will be a distinct function to deserialize borrowed Vec<ColumnSpec<'frame>>.
  • ResultMetadata is made ownership-generic;
  • deser_table_spec no longer allocates; it's up to deser_col_specs to allocate or not, depending on desired ownership of deserialized metadata;
  • deser_col_specs verifies that table specs are the same for all columns. We assume it's true, but the CQL protocol does not enforce it. With that check, we will be able to hold only one table spec per query, not one per column; this will reduce memory footprint. TODO in a follow-up.
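
To illustrate what "ownership-generic" means in the list above, here is a minimal sketch. It is not the driver's code: this `TableSpec` is a simplified stand-in, and `parse_table_spec` and `detach` are hypothetical helpers invented for the example. The assumption is only that names are stored as `Cow<'_, str>`, so the same type can either borrow from the frame or own its data.

```rust
use std::borrow::Cow;

/// Simplified stand-in for a table spec (not the driver's definition):
/// both names may be borrowed from a buffer or owned.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct TableSpec<'a> {
    ks_name: Cow<'a, str>,
    table_name: Cow<'a, str>,
}

impl<'a> TableSpec<'a> {
    /// Borrows both names; performs no allocation.
    pub fn borrowed(ks: &'a str, table: &'a str) -> Self {
        Self {
            ks_name: Cow::Borrowed(ks),
            table_name: Cow::Borrowed(table),
        }
    }

    /// Detaches the spec from the buffer; allocates only for borrowed names.
    pub fn into_owned(self) -> TableSpec<'static> {
        TableSpec {
            ks_name: Cow::Owned(self.ks_name.into_owned()),
            table_name: Cow::Owned(self.table_name.into_owned()),
        }
    }

    pub fn ks_name(&self) -> &str {
        &self.ks_name
    }

    pub fn table_name(&self) -> &str {
        &self.table_name
    }
}

/// Hypothetical helper: parses "keyspace.table" without allocating.
/// The real wire format is different; this only shows the borrowing side.
pub fn parse_table_spec(frame: &str) -> Option<TableSpec<'_>> {
    let (ks, table) = frame.split_once('.')?;
    Some(TableSpec::borrowed(ks, table))
}

/// Hypothetical helper: the caller decides to allocate, so that the result
/// can outlive the buffer it was parsed from.
pub fn detach(frame: String) -> Option<TableSpec<'static>> {
    parse_table_spec(&frame).map(TableSpec::into_owned)
}
```

The point of the split is that the parsing function never allocates; whether to pay for an owned copy is decided one level up.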

Pre-review checklist

  • I have split my patch into logically separate commits.
  • All commit messages clearly explain what they change and why.
  • [ ] I added relevant tests for new features and bug fixes.
  • All commits compile, pass static checks and pass tests.
  • PR description sums up the changes and reasons why they should be introduced.
  • [ ] I have provided docstrings for the public items that I want to introduce.
  • [ ] I have adjusted the documentation in ./docs/source/.
  • [ ] I added appropriate Fixes: annotations to PR description.

@wprzytula wprzytula added this to the 0.15.0 milestone Oct 2, 2024
@wprzytula wprzytula self-assigned this Oct 2, 2024
@github-actions github-actions bot added the semver-checks-breaking cargo-semver-checks reports that this PR introduces breaking API changes label Oct 2, 2024

github-actions bot commented Oct 2, 2024

cargo semver-checks detected some API incompatibilities in this PR.
Checked commit: 80027e2

See the following report for details:

cargo semver-checks output
./scripts/semver-checks.sh --baseline-rev 2ac20a932e4a98c5bf584051e87ca4e2082ca75a
+ cargo semver-checks -p scylla -p scylla-cql --baseline-rev 2ac20a932e4a98c5bf584051e87ca4e2082ca75a
     Cloning 2ac20a932e4a98c5bf584051e87ca4e2082ca75a
     Parsing scylla v0.14.0 (current)
      Parsed [  21.905s] (current)
     Parsing scylla v0.14.0 (baseline)
      Parsed [  20.213s] (baseline)
    Checking scylla v0.14.0 -> v0.14.0 (no change)
     Checked [   0.110s] 89 checks: 89 pass, 0 skip
     Summary no semver update required
    Finished [  42.286s] scylla
     Parsing scylla-cql v0.3.0 (current)
      Parsed [  10.048s] (current)
     Parsing scylla-cql v0.3.0 (baseline)
      Parsed [  10.082s] (baseline)
    Checking scylla-cql v0.3.0 -> v0.3.0 (no change)
     Checked [   0.101s] 89 checks: 87 pass, 2 fail, 0 warn, 0 skip

--- failure struct_pub_field_missing: pub struct's pub field removed or renamed ---

Description:
A publicly-visible struct has at least one public field that is no longer available under its prior name. It may have been renamed or removed entirely.
        ref: https://doc.rust-lang.org/cargo/reference/semver.html#item-remove
       impl: https://github.com/obi1kenobi/cargo-semver-checks/tree/v0.35.0/src/lints/struct_pub_field_missing.ron

Failed in:
  field table_spec of struct ColumnSpec, previously in file /home/runner/work/scylla-rust-driver/scylla-rust-driver/target/semver-checks/git-2ac20a932e4a98c5bf584051e87ca4e2082ca75a/24ed8fcf1967a9d10c8f621e1201047af4d21ec3/scylla-cql/src/frame/response/result.rs:427
  field name of struct ColumnSpec, previously in file /home/runner/work/scylla-rust-driver/scylla-rust-driver/target/semver-checks/git-2ac20a932e4a98c5bf584051e87ca4e2082ca75a/24ed8fcf1967a9d10c8f621e1201047af4d21ec3/scylla-cql/src/frame/response/result.rs:428
  field typ of struct ColumnSpec, previously in file /home/runner/work/scylla-rust-driver/scylla-rust-driver/target/semver-checks/git-2ac20a932e4a98c5bf584051e87ca4e2082ca75a/24ed8fcf1967a9d10c8f621e1201047af4d21ec3/scylla-cql/src/frame/response/result.rs:429
  field col_specs of struct ResultMetadata, previously in file /home/runner/work/scylla-rust-driver/scylla-rust-driver/target/semver-checks/git-2ac20a932e4a98c5bf584051e87ca4e2082ca75a/24ed8fcf1967a9d10c8f621e1201047af4d21ec3/scylla-cql/src/frame/response/result.rs:435

--- failure struct_pub_field_now_doc_hidden: pub struct field is now #[doc(hidden)] ---

Description:
A pub field of a pub struct is now marked #[doc(hidden)] and is no longer part of the public API.
        ref: https://doc.rust-lang.org/rustdoc/write-documentation/the-doc-attribute.html#hidden
       impl: https://github.com/obi1kenobi/cargo-semver-checks/tree/v0.35.0/src/lints/struct_pub_field_now_doc_hidden.ron

Failed in:
  field ColumnSpec.table_spec in file /home/runner/work/scylla-rust-driver/scylla-rust-driver/scylla-cql/src/frame/response/result.rs:475
  field ColumnSpec.name in file /home/runner/work/scylla-rust-driver/scylla-rust-driver/scylla-cql/src/frame/response/result.rs:475
  field ColumnSpec.typ in file /home/runner/work/scylla-rust-driver/scylla-rust-driver/scylla-cql/src/frame/response/result.rs:475
  field ResultMetadata.col_specs in file /home/runner/work/scylla-rust-driver/scylla-rust-driver/scylla-cql/src/frame/response/result.rs:523

     Summary semver requires new major version: 2 major and 0 minor checks failed
    Finished [  20.286s] scylla-cql
make: *** [Makefile:61: semver-rev] Error 1

@wprzytula wprzytula force-pushed the column-spec-lifetime-generic branch from fddd5c7 to 86e2aa8 Compare October 2, 2024 11:49
Comment on lines +81 to +128
impl<'frame> ColumnType<'frame> {
    pub fn into_owned(self) -> ColumnType<'static> {
        match self {
            ColumnType::Custom(cow) => ColumnType::Custom(cow.into_owned().into()),
            ColumnType::Ascii => ColumnType::Ascii,
            ColumnType::Boolean => ColumnType::Boolean,
            ColumnType::Blob => ColumnType::Blob,
            ColumnType::Counter => ColumnType::Counter,
            ColumnType::Date => ColumnType::Date,
            ColumnType::Decimal => ColumnType::Decimal,
            ColumnType::Double => ColumnType::Double,
            ColumnType::Duration => ColumnType::Duration,
            ColumnType::Float => ColumnType::Float,
            ColumnType::Int => ColumnType::Int,
            ColumnType::BigInt => ColumnType::BigInt,
            ColumnType::Text => ColumnType::Text,
            ColumnType::Timestamp => ColumnType::Timestamp,
            ColumnType::Inet => ColumnType::Inet,
            ColumnType::List(elem_type) => ColumnType::List(Box::new(elem_type.into_owned())),
            ColumnType::Map(key_type, value_type) => ColumnType::Map(
                Box::new(key_type.into_owned()),
                Box::new(value_type.into_owned()),
            ),
            ColumnType::Set(elem_type) => ColumnType::Set(Box::new(elem_type.into_owned())),
            ColumnType::UserDefinedType {
                type_name,
                keyspace,
                field_types,
            } => ColumnType::UserDefinedType {
                type_name: type_name.into_owned().into(),
                keyspace: keyspace.into_owned().into(),
                field_types: field_types
                    .into_iter()
                    .map(|(cow, column_type)| (cow.into_owned().into(), column_type.into_owned()))
                    .collect(),
            },
            ColumnType::SmallInt => ColumnType::SmallInt,
            ColumnType::TinyInt => ColumnType::TinyInt,
            ColumnType::Time => ColumnType::Time,
            ColumnType::Timeuuid => ColumnType::Timeuuid,
            ColumnType::Tuple(vec) => {
                ColumnType::Tuple(vec.into_iter().map(ColumnType::into_owned).collect())
            }
            ColumnType::Uuid => ColumnType::Uuid,
            ColumnType::Varint => ColumnType::Varint,
        }
    }
}
Collaborator

In many places in this impl you take a Box<ColumnType<'frame>>, call .into_owned() on the Box contents (thus dropping the Box), and then pass the result to Box::new. This looks like an unnecessary reallocation. I wonder if there is a way to avoid it, because it is a big waste.

Collaborator Author

AFAIK this function is only going to be used in error conditions, to create owned ColumnType for use in error types, so the waste does not matter.
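
As a rough illustration of the use case described in this reply, the sketch below converts a borrowed type into an owned one only on the error path. Everything here is an assumption made for the example: `ColType` is a cut-down stand-in for ColumnType, and `TypeCheckError` and `expect_int` are hypothetical names, not the driver's API.

```rust
use std::borrow::Cow;

/// Cut-down stand-in for a lifetime-generic column type.
#[derive(Debug, Clone, PartialEq)]
pub enum ColType<'frame> {
    Int,
    Text,
    Custom(Cow<'frame, str>),
    List(Box<ColType<'frame>>),
}

impl<'frame> ColType<'frame> {
    pub fn into_owned(self) -> ColType<'static> {
        match self {
            ColType::Int => ColType::Int,
            ColType::Text => ColType::Text,
            ColType::Custom(name) => ColType::Custom(Cow::Owned(name.into_owned())),
            // Moves the element out of its Box, converts it, and boxes it
            // again: this is the extra allocation discussed in the thread.
            ColType::List(elem) => ColType::List(Box::new((*elem).into_owned())),
        }
    }
}

/// Hypothetical error type. It must not borrow from the frame,
/// so it stores a `'static` type.
#[derive(Debug, PartialEq)]
pub struct TypeCheckError {
    pub expected: &'static str,
    pub got: ColType<'static>,
}

/// Hypothetical type check: the happy path allocates nothing; the clone and
/// the owned conversion happen only when an error is constructed.
pub fn expect_int(typ: &ColType<'_>) -> Result<(), TypeCheckError> {
    match typ {
        ColType::Int => Ok(()),
        other => Err(TypeCheckError {
            expected: "int",
            got: other.clone().into_owned(),
        }),
    }
}
```

Under this assumption the re-boxing cost is paid only when a type check fails, which is the argument made above for not optimizing it.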

Comment on lines 1079 to 1084
      );
  }

- fn col_spec(name: &str, typ: ColumnType) -> ColumnSpec {
+ fn col_spec(name: &str, typ: ColumnType<'static>) -> ColumnSpec {
      ColumnSpec {
          table_spec: TableSpec::owned("ks".to_string(), "tbl".to_string()),
Collaborator

WDYT about introducing type alias ColumnTypeOwned? It would be shorter to type and imo easier to read.

Collaborator Author

This could be done with any lifetime-generic type that we have in our codebase, and there are quite a number of them. Should we introduce such an alias for each of them?
I'm a bit reluctant about type aliases, because they tend to hide that the underlying type is the same yet differs wrt ownership. The thing is, we want to expose that fact.
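
For readers weighing the two positions, here is a small sketch of what such an alias would look like. `ColSpec` is a stand-in type and `ColSpecOwned`, `takes_alias`, and `name_of` are hypothetical names; the PR itself does not introduce an alias.

```rust
use std::borrow::Cow;

/// Stand-in for a lifetime-generic spec type.
#[derive(Debug, Clone, PartialEq)]
pub struct ColSpec<'frame> {
    pub name: Cow<'frame, str>,
}

/// The kind of alias proposed in review (hypothetical name).
pub type ColSpecOwned = ColSpec<'static>;

/// Accepts the alias and returns the spelled-out type.
pub fn takes_alias(spec: ColSpecOwned) -> ColSpec<'static> {
    // No conversion is needed: an alias is the same type, not a new one.
    spec
}

/// Works for any lifetime, so it accepts borrowed and owned specs alike.
pub fn name_of<'s>(spec: &'s ColSpec<'_>) -> &'s str {
    &spec.name
}
```

The alias saves typing, but a signature written with it no longer shows that the owned and borrowed forms are one type differing only in lifetime, which is the concern raised in the reply.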

Comment on lines 1148 to 1156
  /// - `Some(Some(...))` - non-null, present value
  pub struct UdtIterator<'frame> {
-     all_fields: &'frame [(String, ColumnType)],
+     all_fields: &'frame [(Cow<'frame, str>, ColumnType<'frame>)],
      type_name: &'frame str,
      keyspace: &'frame str,
-     remaining_fields: &'frame [(String, ColumnType)],
+     remaining_fields: &'frame [(Cow<'frame, str>, ColumnType<'frame>)],
      raw_iter: BytesSequenceIterator<'frame>,
  }

Collaborator

Do those Cows have to be propagated to this file? Is there a problem with &'frame str?

Collaborator Author

@wprzytula wprzytula Oct 3, 2024

This is a valid question. ColumnType::UserDefinedType contains Vec<(Cow<'frame, str>, ColumnType<'frame>)>, so when borrowing from that Vec, we get &'frame [(Cow<'frame, str>, ColumnType<'frame>)], which can't be cast to [(&'frame str, ColumnType<'frame>)]. So, the answer is that Cows have to be propagated.
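
The constraint in this answer can be sketched as follows. The `Fields` alias and the `u8` payload are stand-ins chosen for the example (the `u8` takes the place of ColumnType), and `names` and `collect_borrowed` are hypothetical helpers. The element type `(Cow<str>, T)` differs from `(&str, T)`, so a borrowed slice of one cannot be reinterpreted as a slice of the other; names can still be viewed as `&str` element by element.

```rust
use std::borrow::Cow;

/// Stand-in for the field list of a user-defined type.
/// `u8` takes the place of the column type to keep the sketch short.
pub type Fields<'frame> = Vec<(Cow<'frame, str>, u8)>;

/// Viewing names as `&str` one element at a time needs no allocation:
/// `&**name` goes from `&Cow<str>` to `&str`.
pub fn names<'a>(fields: &'a [(Cow<'a, str>, u8)]) -> impl Iterator<Item = &'a str> + 'a {
    fields.iter().map(|(name, _)| &**name)
}

/// Obtaining an actual sequence of `(&str, _)` pairs requires building a new
/// Vec, which is the allocation a borrowing iterator wants to avoid.
pub fn collect_borrowed<'a>(fields: &'a [(Cow<'a, str>, u8)]) -> Vec<(&'a str, u8)> {
    fields.iter().map(|(name, typ)| (&**name, *typ)).collect()
}
```

An iterator that stores the slice itself, as UdtIterator does, therefore has to name the `Cow` in its field types.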

Comment on lines 722 to 728
      buf: &mut &[u8],
      global_table_spec: &Option<TableSpec<'static>>,
      col_count: usize,
- ) -> StdResult<Vec<ColumnSpec>, ColumnSpecParseError> {
+ ) -> StdResult<Vec<ColumnSpec<'static>>, ColumnSpecParseError> {
      let mut col_specs = Vec::with_capacity(col_count);
      for col_idx in 0..col_count {
          let table_spec = if let Some(spec) = global_table_spec {
Collaborator

Same ask for ColumnSpec: can we have ColumnSpecOwned alias instead of typing <'static> everywhere?

Collaborator Author

(already answered elsewhere)

It makes sense to have public getters, but crate-private fields. Thanks
to that, in the future we will be able to change underlying data
representation, e.g. use smart pointers like Arc or Cow, retaining API
compatibility.
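
A minimal sketch of the getter pattern this commit message describes, under stated assumptions: `ColSpec` and `ColKind` are stand-ins, not the driver's types, and the constructor names are chosen for the example. Fields are crate-private, so the representation of `name` (here a `Cow`) can change later without breaking callers that use only the getters.

```rust
use std::borrow::Cow;

/// Stand-in for a column type; kept trivial for the sketch.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ColKind {
    Int,
    Text,
}

/// Stand-in column spec with crate-private fields.
#[derive(Debug, Clone, PartialEq)]
pub struct ColSpec<'frame> {
    pub(crate) name: Cow<'frame, str>,
    pub(crate) typ: ColKind,
}

impl<'frame> ColSpec<'frame> {
    /// Borrows the name from the caller's buffer; no allocation.
    pub fn borrowed(name: &'frame str, typ: ColKind) -> Self {
        Self { name: Cow::Borrowed(name), typ }
    }

    /// Getter returns `&str`, hiding whether the name is borrowed or owned.
    pub fn name(&self) -> &str {
        &self.name
    }

    pub fn typ(&self) -> ColKind {
        self.typ
    }
}

impl ColSpec<'static> {
    /// Takes ownership of the name.
    pub fn owned(name: String, typ: ColKind) -> Self {
        Self { name: Cow::Owned(name), typ }
    }
}
```

Code outside the crate sees the same `name()` and `typ()` surface for both constructors, which is what keeps a later change of representation API-compatible.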

This is a necessary step towards non-allocating (in fact
less-allocating) result metadata deserialization.

This enables us to create borrowed ColumnSpecs that do not allocate,
which will be used in a further refactor for lazy metadata
deserialization.

This hides the internal implementation, enabling its alteration in the
next commits and possibly in the future.

This is a necessary step towards non-allocating (in fact
less-allocating) result metadata deserialization.

Those functions are close to each other logically, so let's keep them
close in code, too, for easier reading.

This is a step towards non-allocating result metadata deserialization.
The allocation is moved out, to deser_col_specs.

We assume that the table spec is the same for each column. As this is
not guaranteed by the CQL protocol specification but only by how
Cassandra and ScyllaDB work (no support for joins), we perform a sanity
check.
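
The sanity check described in the last commit message could look roughly like the sketch below. It is an illustration only: `common_table_spec` and `TableSpecMismatch` are hypothetical names, and table specs are reduced to `(keyspace, table)` string pairs rather than the driver's types.

```rust
/// Hypothetical error: reports the first column whose table spec differs.
#[derive(Debug, PartialEq, Eq)]
pub struct TableSpecMismatch {
    pub column_index: usize,
}

/// Returns the single (keyspace, table) pair shared by all columns,
/// `Ok(None)` when there are no columns, or an error pointing at the first
/// column whose pair differs from that of column 0.
pub fn common_table_spec<'a>(
    per_column: &[(&'a str, &'a str)],
) -> Result<Option<(&'a str, &'a str)>, TableSpecMismatch> {
    let mut iter = per_column.iter().copied().enumerate();

    // The first column, if any, defines the expected table spec.
    let first = match iter.next() {
        Some((_, spec)) => spec,
        None => return Ok(None),
    };

    // Every remaining column must name the same keyspace and table.
    for (column_index, spec) in iter {
        if spec != first {
            return Err(TableSpecMismatch { column_index });
        }
    }
    Ok(Some(first))
}
```

Once such a check passes, a single table spec per result is enough, which is the memory saving the PR description defers to a follow-up.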
@wprzytula wprzytula force-pushed the column-spec-lifetime-generic branch from 86e2aa8 to 80027e2 Compare October 7, 2024 11:00
@wprzytula
Collaborator Author

v2.0:

  • refactored the deser_table_spec/deser_col_specs tandem again to remove unwrap() and to divide responsibilities more sensibly,
  • addressed comments.

@wprzytula wprzytula changed the title make ResultMetadata lifetime generic make ResultMetadata lifetime-generic Oct 7, 2024
@wprzytula
Collaborator Author

In the PR description, I listed out the changes introduced.

@wprzytula wprzytula merged commit 92254fe into scylladb:main Oct 7, 2024
11 checks passed
@wprzytula wprzytula deleted the column-spec-lifetime-generic branch October 7, 2024 14:08
@wprzytula wprzytula mentioned this pull request Nov 6, 2024
8 tasks
@wprzytula wprzytula mentioned this pull request Nov 14, 2024
Labels
area/deserialization semver-checks-breaking cargo-semver-checks reports that this PR introduces breaking API changes