More advanced connection pooling #569

simonw · 2019-08-17T13:20:41Z

We need a much smarter way of handling database connections.

Today, connections are simple: Datasette runs a number of threads (defaults to 3) and each thread gets a threadlocal read-only (or immutable) connection to each attached database - opened on demand.

For Datasette Library (#417) I want to support potentially hundreds of attached databases. Datasette Edit (#567) is going to introduce a need for writable connections too.

I'd also like to be able to run joins across multiple databases (#283) which further complicates things.

Supporting thousands of open SQLite connections at once feels like it won't provide good enough performance (though I should benchmark that to be sure). Some kind of connection pooling is likely to be necessary.

simonw · 2019-08-17T13:24:50Z

I think what I want it a mechanism where a thread can say "give me a connection for database X" and it either gets back connection to X instantly OR a new connection to X is created and returned OR it blocks (because a certain number of connections to that database exist already) until another thread returns their connection OR it times out and returns an error.

simonw · 2019-08-17T13:44:24Z

Potential API design:

with pool.connection("fixtures") as conn:
    conn.set_time_limit(1000)
    conn.allow_all()
    conn.execute(...)

Within this block the current thread has exclusive action to the connection - which has essentially been "checked out" from the pool. When the block ends that connection is made available to be checked out by other threads.

This could accept multiple database names for the case where I want to join across databases:

with pool.connection("fixtures", "dogs") as conn:
    conn.execute(...)

simonw · 2019-10-02T22:41:12Z

I'm going to refactor the execute() and execute_against_connection_in_thread() methods.

They currently live on the Datasette class, but in this new world it would make more sense for them to live on the Database, ConnectionGroup or Connection classes.

I think I'll put them on the Database class.

simonw · 2019-10-02T22:44:36Z

I'm going to simplify things a bunch by continuing to ignore the cross-database joining issue #283 - I'll post some notes there on my latest thinking.

Also moved Pool stuff into the datasette/database.py module Refs #569

Refs #569 - I split this change out from #579

Refs #569

simonw added a commit that referenced this issue Oct 2, 2019

First working sketch of connection pool, refs #569

57ee603

simonw mentioned this issue Oct 2, 2019

Support cross-database joins #283

Closed

simonw added a commit that referenced this issue Oct 2, 2019

Refactor .execute() onto Database class

8d00d24

Also moved Pool stuff into the datasette/database.py module Refs #569

simonw mentioned this issue Oct 2, 2019

New connection pooling #579

Open

simonw added a commit that referenced this issue Nov 15, 2019

Move .execute() from Datasette to Database

a9909c2

Refs #569 - I split this change out from #579

simonw added a commit that referenced this issue Nov 15, 2019

New Pool/ConnectionGroup implementation

32cbfd2

Refs #569

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

More advanced connection pooling #569

More advanced connection pooling #569

simonw commented Aug 17, 2019

simonw commented Aug 17, 2019

simonw commented Aug 17, 2019 •

edited

Loading

simonw commented Oct 2, 2019

simonw commented Oct 2, 2019

More advanced connection pooling #569

More advanced connection pooling #569

Comments

simonw commented Aug 17, 2019

simonw commented Aug 17, 2019

simonw commented Aug 17, 2019 • edited Loading

simonw commented Oct 2, 2019

simonw commented Oct 2, 2019

simonw commented Aug 17, 2019 •

edited

Loading