Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Delete by prefix operator in kvdb #360

Merged
merged 29 commits into from
Mar 27, 2020
Merged
Show file tree
Hide file tree
Changes from 12 commits
Commits
Show all changes
29 commits
Select commit Hold shift + click to select a range
616b401
Switch from `parity-rocksdb` to upstream `rust-rocksdb`
bkchr Dec 4, 2018
b0317f6
Update to latest rocksdb
gnunicorn Jan 11, 2019
b6cf86b
Delete prefix as range exept when start is full of 255 (corner case).
cheme Aug 13, 2019
9ce83b0
Merge commit 'b0317f649ab2c665b7987b8475878fc4d2e1f81d' into upgraded…
cheme Aug 13, 2019
6e4f390
forcing ignore lifetime (not more unsafe than before, columnfamily is
cheme Aug 14, 2019
168b103
Merge branch 'master' into upgraded_del_range, largely update previous
cheme Mar 12, 2020
6495423
remove temporarilly path deps for easier patch usage.
cheme Mar 12, 2020
dac8aa1
being a bit more exhaustive
cheme Mar 12, 2020
a3a2a54
Tests for delete prefix, fix rocksdb full column delete.
cheme Mar 13, 2020
a2b481c
revert util-mem
cheme Mar 13, 2020
81e460e
update versionning.
cheme Mar 13, 2020
2018880
Merge branch 'master' into upgraded_del_range_pr
cheme Mar 26, 2020
27a3828
better end prefix from review
cheme Mar 26, 2020
1f60726
revert test error check
cheme Mar 26, 2020
2eb1fd8
Update kvdb-memorydb/src/lib.rs
cheme Mar 26, 2020
ca2269f
Update kvdb-shared-tests/src/lib.rs
cheme Mar 26, 2020
fcc4e58
Update kvdb/src/lib.rs
cheme Mar 26, 2020
34b1e0b
Update kvdb/src/lib.rs
cheme Mar 26, 2020
5a892f8
applying suggestions, and remove delete by prefix method.
cheme Mar 26, 2020
1cedb14
Merge branch 'upgraded_del_range_pr' of github.com:cheme/parity-commo…
cheme Mar 26, 2020
da49663
io stats num column
cheme Mar 26, 2020
05d9469
end_prefix test
cheme Mar 26, 2020
20bac79
format
cheme Mar 26, 2020
815d6ca
Redundant delete.
cheme Mar 26, 2020
1564f5f
Update kvdb/src/lib.rs
cheme Mar 27, 2020
904ebae
Update kvdb/src/lib.rs
cheme Mar 27, 2020
2456135
Documentation fix and additional test case in end_prefix_test
cheme Mar 27, 2020
1f18d0f
Doc.
cheme Mar 27, 2020
9e33a73
doc
cheme Mar 27, 2020
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 1 addition & 1 deletion kvdb-memorydb/Cargo.toml
Original file line number Diff line number Diff line change
Expand Up @@ -13,4 +13,4 @@ parking_lot = "0.10.0"
kvdb = { version = "0.5", path = "../kvdb" }

[dev-dependencies]
kvdb-shared-tests = { path = "../kvdb-shared-tests", version = "0.2" }
kvdb-shared-tests = { path = "../kvdb-shared-tests", version = "0.3" }
44 changes: 37 additions & 7 deletions kvdb-memorydb/src/lib.rs
Original file line number Diff line number Diff line change
Expand Up @@ -66,7 +66,24 @@ impl KeyValueDB for InMemory {
if let Some(col) = columns.get_mut(&col) {
col.remove(&*key);
}
}
},
DBOp::DeletePrefix { col, prefix } => {
if let Some(col) = columns.get_mut(&col) {
use std::ops::Bound;
if prefix.len() == 0 {
cheme marked this conversation as resolved.
Show resolved Hide resolved
col.clear();
} else {
let start_range = Bound::Included(prefix.to_vec());
let end_range = Bound::Excluded(kvdb::end_prefix(&prefix[..]));
let keys: Vec<_> = col.range((start_range, end_range))
.map(|(k, _)| k.clone())
.collect();
for key in keys.into_iter() {
col.remove(&key[..]);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this could be optimized by using split_off and append, but currently append is slow: rust-lang/rust#34666 and I think memorydb is only used for tests and kvdb-web, so it's fine for now

}
}
}
},
}
}
Ok(())
Expand Down Expand Up @@ -112,36 +129,49 @@ mod tests {
#[test]
fn get_fails_with_non_existing_column() -> io::Result<()> {
let db = create(1);
st::test_get_fails_with_non_existing_column(&db)
st::test_get_fails_with_non_existing_column(&db)?;
Ok(())
}

#[test]
fn put_and_get() -> io::Result<()> {
let db = create(1);
st::test_put_and_get(&db)
st::test_put_and_get(&db)?;
Ok(())
}

#[test]
fn delete_and_get() -> io::Result<()> {
let db = create(1);
st::test_delete_and_get(&db)
st::test_delete_and_get(&db)?;
Ok(())
}

#[test]
fn delete_prefix() -> io::Result<()> {
let db = create(st::NB_DELETE_PREFIX_TESTS);
st::test_delete_prefix(&db)?;
Ok(())
}

#[test]
fn iter() -> io::Result<()> {
let db = create(1);
st::test_iter(&db)
st::test_iter(&db)?;
Ok(())
}

#[test]
fn iter_from_prefix() -> io::Result<()> {
let db = create(1);
st::test_iter_from_prefix(&db)
st::test_iter_from_prefix(&db)?;
Ok(())
}

#[test]
fn complex() -> io::Result<()> {
let db = create(1);
st::test_complex(&db)
st::test_complex(&db)?;
Ok(())
}
}
2 changes: 1 addition & 1 deletion kvdb-rocksdb/Cargo.toml
Original file line number Diff line number Diff line change
Expand Up @@ -27,7 +27,7 @@ parity-util-mem = { path = "../parity-util-mem", version = "0.6", default-featur
alloc_counter = "0.0.4"
criterion = "0.3"
ethereum-types = { path = "../ethereum-types" }
kvdb-shared-tests = { path = "../kvdb-shared-tests", version = "0.2" }
kvdb-shared-tests = { path = "../kvdb-shared-tests", version = "0.3" }
rand = "0.7.2"
tempdir = "0.3.7"
keccak-hash = { path = "../keccak-hash" }
Expand Down
40 changes: 33 additions & 7 deletions kvdb-rocksdb/src/lib.rs
Original file line number Diff line number Diff line change
Expand Up @@ -443,6 +443,18 @@ impl Database {
stats_total_bytes += key.len();
batch.delete_cf(cf, &key).map_err(other_io_err)?
}
DBOp::DeletePrefix { col: _, prefix } => {
if prefix.len() > 0 {
let end_range = kvdb::end_prefix(&prefix[..]);
batch.delete_range_cf(cf, &prefix[..], &end_range[..]).map_err(other_io_err)?;
} else {
// delete the whole column
let end_range = &[255u8];
ordian marked this conversation as resolved.
Show resolved Hide resolved
batch.delete_range_cf(cf, &prefix[..], &end_range[..]).map_err(other_io_err)?;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why do we need to delete a range, if we're deleting the whole column later?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did not delete the whole column (at first I tried to but it requires mutable access and some changes overall). So I end up thinking that if we use the delete range to delete the column we are doing something wrong, so using delete range for it felt more consistent.
(I will change the comment)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
batch.delete_range_cf(cf, &prefix[..], &end_range[..]).map_err(other_io_err)?;
batch.delete_range_cf(cf, &[], &end_range[..]).map_err(other_io_err)?;

batch.delete_cf(cf, &end_range[..]).map_err(other_io_err)?;
batch.delete_cf(cf, &[]).map_err(other_io_err)?;
ordian marked this conversation as resolved.
Show resolved Hide resolved
}
}
};
}
self.stats.tally_bytes_written(stats_total_bytes as u64);
Expand Down Expand Up @@ -690,43 +702,57 @@ mod tests {
#[test]
fn get_fails_with_non_existing_column() -> io::Result<()> {
let db = create(1)?;
st::test_get_fails_with_non_existing_column(&db)
st::test_get_fails_with_non_existing_column(&db)?;
Ok(())
ordian marked this conversation as resolved.
Show resolved Hide resolved
}

#[test]
fn put_and_get() -> io::Result<()> {
let db = create(1)?;
st::test_put_and_get(&db)
st::test_put_and_get(&db)?;
Ok(())
}

#[test]
fn delete_and_get() -> io::Result<()> {
let db = create(1)?;
st::test_delete_and_get(&db)
st::test_delete_and_get(&db)?;
Ok(())
}

#[test]
fn delete_prefix() -> io::Result<()> {
let db = create(st::NB_DELETE_PREFIX_TESTS)?;
st::test_delete_prefix(&db)?;
Ok(())
}

#[test]
fn iter() -> io::Result<()> {
let db = create(1)?;
st::test_iter(&db)
st::test_iter(&db)?;
Ok(())
}

#[test]
fn iter_from_prefix() -> io::Result<()> {
let db = create(1)?;
st::test_iter_from_prefix(&db)
st::test_iter_from_prefix(&db)?;
Ok(())
}

#[test]
fn complex() -> io::Result<()> {
let db = create(1)?;
st::test_complex(&db)
st::test_complex(&db)?;
Ok(())
}

#[test]
fn stats() -> io::Result<()> {
let db = create(3)?;
st::test_io_stats(&db)
st::test_io_stats(&db)?;
Ok(())
}

#[test]
Expand Down
2 changes: 1 addition & 1 deletion kvdb-shared-tests/Cargo.toml
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
[package]
name = "kvdb-shared-tests"
version = "0.2.0"
version = "0.3.0"
authors = ["Parity Technologies <[email protected]>"]
edition = "2018"
description = "Shared tests for kvdb functionality, to be executed against actual implementations"
Expand Down
56 changes: 56 additions & 0 deletions kvdb-shared-tests/src/lib.rs
Original file line number Diff line number Diff line change
Expand Up @@ -171,6 +171,62 @@ pub fn test_io_stats(db: &dyn KeyValueDB) -> io::Result<()> {
Ok(())
}

pub const NB_DELETE_PREFIX_TESTS: u32 = 5;
cheme marked this conversation as resolved.
Show resolved Hide resolved

/// A test for `KeyValueDB::delete_prefix`.
pub fn test_delete_prefix(db: &dyn KeyValueDB) -> io::Result<()> {
let keys = [
&[][..],
&[0u8][..],
&[0, 1][..],
&[1][..],
&[1, 0][..],
&[1, 255][..],
&[1, 255, 255][..],
&[2][..],
&[2, 0][..],
&[2, 255][..],
];
let init_db = |ix: u32| -> io::Result<()> {
let mut batch = db.transaction();
for (i, key) in keys.iter().enumerate() {
batch.put(ix, key, &[i as u8]);
}
db.write(batch)?;
Ok(())
};
let check_db = |ix: u32, content: [bool; 10]| -> io::Result<()> {
let mut state = [true; 10];
for (c, key) in keys.iter().enumerate() {
state[c] = db.get(ix, key)?.is_some();
}
assert_eq!(state, content, "at {}", ix);
Ok(())
};
let tests: [_; NB_DELETE_PREFIX_TESTS as usize] = [
// standard
(&[1u8][..], [true, true, true, false, false, false, false, true, true, true]),
// edge
(&[1u8, 255, 255][..], [true, true, true, true, true, true, false, true, true, true]),
// none 1
(&[1, 2][..], [true, true, true, true, true, true, true, true, true, true]),
// none 2
(&[8][..], [true, true, true, true, true, true, true, true, true, true]),
// all
(&[][..], [false, false, false, false, false, false, false, false, false, false]),
];
for (ix, test) in tests.iter().enumerate() {
let ix = ix as u32;
init_db(ix)?;
let mut batch = db.transaction();
batch.delete_prefix(ix, test.0);
db.write(batch)?;
check_db(ix, test.1)?;
}

Ok(())
}

/// A complex test.
pub fn test_complex(db: &dyn KeyValueDB) -> io::Result<()> {
let key1 = b"02c69be41d0b7e40352fc85be1cd65eb03d40ef8427a0ca4596b1ead9a00e9fc";
Expand Down
3 changes: 2 additions & 1 deletion kvdb-web/Cargo.toml
Original file line number Diff line number Diff line change
Expand Up @@ -34,11 +34,12 @@ features = [
'EventTarget',
'IdbCursor',
'IdbCursorWithValue',
'IdbKeyRange',
'DomStringList',
]

[dev-dependencies]
console_log = "0.1.2"
kvdb-shared-tests = { path = "../kvdb-shared-tests", version = "0.2" }
kvdb-shared-tests = { path = "../kvdb-shared-tests", version = "0.3" }
wasm-bindgen-test = "0.3.4"
wasm-bindgen-futures = "0.4.4"
15 changes: 14 additions & 1 deletion kvdb-web/src/indexed_db.rs
Original file line number Diff line number Diff line change
Expand Up @@ -10,7 +10,7 @@

use js_sys::{Array, ArrayBuffer, Uint8Array};
use wasm_bindgen::{closure::Closure, JsCast, JsValue};
use web_sys::{Event, IdbCursorWithValue, IdbDatabase, IdbOpenDbRequest, IdbRequest, IdbTransactionMode};
use web_sys::{Event, IdbCursorWithValue, IdbDatabase, IdbOpenDbRequest, IdbRequest, IdbTransactionMode, IdbKeyRange};

use futures::channel;
use futures::prelude::*;
Expand Down Expand Up @@ -157,6 +157,19 @@ pub fn idb_commit_transaction(idb: &IdbDatabase, txn: &DBTransaction, columns: u
warn!("error deleting key from col_{}: {:?}", column, err);
}
}
DBOp::DeletePrefix { col, prefix } => {
let column = *col as usize;
// Convert rust bytes to js arrays
let prefix_js_start = Uint8Array::from(prefix.as_ref());
let prefix_js_end = Uint8Array::from(prefix.as_ref());

let range = IdbKeyRange::bound(prefix_js_start.as_ref(), prefix_js_end.as_ref())
.expect("Starting and ending at same value is valid bound; qed");
let res = object_stores[column].delete(range.as_ref());
if let Err(err) = res {
warn!("error deleting prefix from col_{}: {:?}", column, err);
}
}
}
}

Expand Down
6 changes: 6 additions & 0 deletions kvdb-web/tests/indexed_db.rs
Original file line number Diff line number Diff line change
Expand Up @@ -39,6 +39,12 @@ async fn delete_and_get() {
st::test_delete_and_get(&db).unwrap()
}

#[wasm_bindgen_test]
async fn delete_prefix() {
let db = open_db(st::NB_DELETE_PREFIX_TESTS, "delete_prefix").await;
st::test_delete_prefix(&db).unwrap()
}

#[wasm_bindgen_test]
async fn iter() {
let db = open_db(1, "iter").await;
Expand Down
37 changes: 37 additions & 0 deletions kvdb/src/lib.rs
Original file line number Diff line number Diff line change
Expand Up @@ -35,6 +35,7 @@ pub struct DBTransaction {
pub enum DBOp {
Insert { col: u32, key: DBKey, value: DBValue },
Delete { col: u32, key: DBKey },
DeletePrefix { col: u32, prefix: DBKey },
}

impl DBOp {
Expand All @@ -43,6 +44,7 @@ impl DBOp {
match *self {
DBOp::Insert { ref key, .. } => key,
DBOp::Delete { ref key, .. } => key,
DBOp::DeletePrefix { ref prefix, .. } => prefix,
}
}

Expand All @@ -51,6 +53,7 @@ impl DBOp {
match *self {
DBOp::Insert { col, .. } => col,
DBOp::Delete { col, .. } => col,
DBOp::DeletePrefix { col, .. } => col,
}
}
}
Expand Down Expand Up @@ -80,6 +83,12 @@ impl DBTransaction {
pub fn delete(&mut self, col: u32, key: &[u8]) {
self.ops.push(DBOp::Delete { col, key: DBKey::from_slice(key) });
}

/// Delete value by key.
cheme marked this conversation as resolved.
Show resolved Hide resolved
pub fn delete_prefix(&mut self, col: u32, prefix: &[u8]) {
self.ops.push(DBOp::DeletePrefix { col, prefix: DBKey::from_slice(prefix) });
}

cheme marked this conversation as resolved.
Show resolved Hide resolved
}

/// Generic key-value database.
Expand All @@ -106,6 +115,17 @@ pub trait KeyValueDB: Sync + Send + parity_util_mem::MallocSizeOf {
/// Write a transaction of changes to the backing store.
fn write(&self, transaction: DBTransaction) -> io::Result<()>;

/// Delete all prefixed key (do not delete buffered values and
/// do not support atomic operation in a transaction).
ordian marked this conversation as resolved.
Show resolved Hide resolved
fn delete_prefix(&self, col: u32, prefix: &[u8]) -> io::Result<()> {
// default unefficient implementation.
let mut transaction = DBTransaction::new();
self.iter_from_prefix(col, prefix).for_each(|(key, _)| {
transaction.delete_prefix(col, &key[..]);
ordian marked this conversation as resolved.
Show resolved Hide resolved
});
self.write(transaction)
}

/// Iterate over the data for a given column.
fn iter<'a>(&'a self, col: u32) -> Box<dyn Iterator<Item = (Box<[u8]>, Box<[u8]>)> + 'a>;

Expand All @@ -129,3 +149,20 @@ pub trait KeyValueDB: Sync + Send + parity_util_mem::MallocSizeOf {
IoStats::empty()
}
}

/// Return for a start inclusive prefix, the non inclusive end.
cheme marked this conversation as resolved.
Show resolved Hide resolved
pub fn end_prefix(prefix: &[u8]) -> Vec<u8> {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this function can be simplified

fn end_prefix(mut prefix: Vec<u8>) -> Vec<u8> {
    while let Some(0xff) = prefix.last() {
        prefix.pop();
    }
    if let Some(byte) = prefix.last_mut() {
        *byte += 1;
    }
    prefix
}

I'm not sure we want to expose it kvdb, I see it's used as a utility function in kvdb-rocksdb and kvdb-memorydb, but for example, we don't expose other_io_error

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also if the prefix is empty or 0xfffffff...ff, will it delete the whole db?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And we could reuse this function for iter_from_prefix by setting https://docs.rs/rocksdb/0.13.0/rocksdb/struct.ReadOptions.html#method.set_iterate_upper_bound

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for empty prefix it does indeed delete every values (not in a very efficient way).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about 0xff prefix? I think we should add some tests and document the difference between 0xff and 0xff00 as it's not intuitive.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure how to document that, I will add a test it may make thing more explicit.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oups, I actually meant 0xff and 0x00ff

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure how to document that,

Let's add a few examples and include interesting cases such as 0xff and 0xff00 among the examples?

let mut end_range = prefix.to_vec();
loop {
if end_range.len() == 0 {
return end_range;
}
let end_index = end_range.len() - 1;
if end_range[end_index] < u8::max_value() {
end_range[end_index] = end_range[end_index] + 1;
return end_range;
} else {
end_range.pop();
};
};
}