Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SQL: counters for rows read and written #979

Merged

Conversation

smerritt
Copy link
Contributor

@smerritt smerritt commented Aug 4, 2023

In a durable object, state.storage.sql now has two new properties: rowsRead and rowsWritten. These count the number of rows read and written by the underlying SQLite database.

This is more than just the number of rows returned from queries or the number of records inserted/updated; it counts all rows read by the database engine in the processing of a query. For example, SELECT MAX(unindexed_column) FROM a_table returns a single result but reads every row from disk, and rowsRead reflects that.

@smerritt smerritt force-pushed the smerritt/sqlite-row-counts-amalgamated branch from 217fadf to 91b9597 Compare August 4, 2023 23:09
@smerritt smerritt marked this pull request as ready for review August 4, 2023 23:20
@kentonv
Copy link
Member

kentonv commented Aug 5, 2023

Does this patch come directly from libSQL? If so we need to include license and copyright information, probably in the description header for each patch file. (Looks like it's MIT license, so no issues using it, but still need to include the copyright.)

src/workerd/util/sqlite.h Outdated Show resolved Hide resolved
src/workerd/util/sqlite.h Outdated Show resolved Hide resolved
src/workerd/util/sqlite.c++ Outdated Show resolved Hide resolved
src/workerd/util/sqlite.c++ Outdated Show resolved Hide resolved
src/workerd/api/sql.h Outdated Show resolved Hide resolved
src/workerd/util/sqlite.c++ Outdated Show resolved Hide resolved
@smerritt smerritt force-pushed the smerritt/sqlite-row-counts-amalgamated branch 2 times, most recently from 381b2b6 to eb97f7c Compare August 7, 2023 23:52
@smerritt
Copy link
Contributor Author

smerritt commented Aug 7, 2023

The patch now contains an appropriate copyright header pointing at libsql.

@smerritt smerritt force-pushed the smerritt/sqlite-row-counts-amalgamated branch from eb97f7c to 085518a Compare August 7, 2023 23:56
src/workerd/util/sqlite.h Outdated Show resolved Hide resolved
@@ -167,8 +167,12 @@ auto SqlStorage::Cursor::iteratorImpl(jsg::Lock& js, jsg::Ref<Cursor>& obj, Func
}

auto& query = state.query;

obj->rowsRead = query.getRowsRead();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, it seems wasteful to query these counters on every single row even if the app doesn't ever use them. (And even apps that do use them probably wouldn't check them on every row.)

Maybe we could have the implementation of getRowsRead() delegate to the underlying Query if it is still active. Then we only need to make a copy when the query is done.

Or maybe this isn't expensive enough to matter?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They're cheap enough that this probably doesn't matter, but I agree that it looks wasteful, which is a trap for future programmers to fall into.

I took your suggestion of saving them off when the Query is done. I didn't see a way of invalidating a query out from under a Cursor, but if you know of one, then please tell me because that'll make the cursor falsely report 0 for row counts.

@smerritt smerritt force-pushed the smerritt/sqlite-row-counts-amalgamated branch from 085518a to cf38956 Compare August 9, 2023 23:16
This takes the row-counting logic from libsql[1] and sticks it into
our copy of SQLite.

libsql is an ambitious fork of SQLite, so switching out the entire
library would be quite risky. Porting the row-counting logic from
libsql (which is MIT-licensed) is a much less risky change, though it
does require us to carry a patch around forever, or at least until
SQLite implements something similar, if they ever do.

[1] A fork of sqlite that accepts contributions.
    https://github.com/libsql/libsql
The `storage.sql` object in JS now has two new properties: `rowsRead`
and `rowsWritten`, which contain appropriate numbers. Similarly,
workerd::api::SqliteDatabase::getRows[Read|Written] provide the same
numbers in C++.

The counters are only updated after a full result set has been read. I
thought about updating the counters as each row was read, but that has
the disadvantage of counting rows that weren't useful due to errors.
If one wanted to meter usage based on rows, one would only want to
count rows for successfully-executed queries.
@smerritt smerritt force-pushed the smerritt/sqlite-row-counts-amalgamated branch from cf38956 to 26f0498 Compare August 10, 2023 16:37
@smerritt smerritt merged commit bb06753 into cloudflare:main Aug 10, 2023
@smerritt smerritt deleted the smerritt/sqlite-row-counts-amalgamated branch August 10, 2023 17:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants