Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

datasette one.db one.db opens database twice, as one and one_2 #1632

Closed
simonw opened this issue Feb 7, 2022 · 6 comments
Closed

datasette one.db one.db opens database twice, as one and one_2 #1632

simonw opened this issue Feb 7, 2022 · 6 comments
Labels
Milestone

Comments

@simonw
Copy link
Owner

simonw commented Feb 7, 2022

% mkdir /tmp/data
% cp ~/Dropbox/Development/datasette/fixtures.db /tmp/data 
% datasette /tmp/data/*.db /tmp/data/created.db --create -p 8852
...
INFO:     Uvicorn running on http://127.0.0.1:8852 (Press CTRL+C to quit)
^CINFO:     Shutting down
% datasette /tmp/data/*.db /tmp/data/created.db --create -p 8852
...
INFO:     127.0.0.1:49533 - "GET / HTTP/1.1" 200 OK

The first time I ran Datasette I got two databases - fixtures and created

BUT... when I ran Datasette the second time it looked like this:

image

This is the same result you get if you run:

datasette /tmp/data/fixtures.db /tmp/data/created.db /tmp/data/created.db

This is caused by this Datasette issue:

So... either I teach Datasette to de-duplicate multiple identical file paths passed to the command, or I can't use /data/*.db in the Dockerfile here and I need to go back to other solutions for the challenge described in this comment: simonw/datasette-publish-fly#12 (comment)

Originally posted by @simonw in simonw/datasette-publish-fly#12 (comment)

@simonw simonw added the bug label Feb 7, 2022
@simonw
Copy link
Owner Author

simonw commented Feb 7, 2022

I found this bug while trying to get the following to work:

datasette /data/one.db /data/two.db /data/*.db --create

I want this to create any missing database files on startup out of that literal list of one.db and two.db and to also open any other *.db files in that folder - needed for datasette-publish-fly in simonw/datasette-publish-fly#12 (comment)

@simonw
Copy link
Owner Author

simonw commented Feb 7, 2022

I'm going to fix this in a 0.60.2 bug fix release.

@simonw
Copy link
Owner Author

simonw commented Feb 7, 2022

I'm going to fix this in the CLI code itself, rather than fixing it in the Datasette constructor. That way if someone has a truly weird reason to want this behaviour they can construct Datasette directly.

datasette/datasette/cli.py

Lines 535 to 550 in 03305ea

# if files is a single directory, use that as config_dir=
if 1 == len(files) and os.path.isdir(files[0]):
kwargs["config_dir"] = pathlib.Path(files[0])
files = []
# Verify list of files, create if needed (and --create)
for file in files:
if not pathlib.Path(file).exists():
if create:
sqlite3.connect(file).execute("vacuum")
else:
raise click.ClickException(
"Invalid value for '[FILES]...': Path '{}' does not exist.".format(
file
)
)

@simonw
Copy link
Owner Author

simonw commented Feb 7, 2022

For the record, here's the code that picks the one_2 name if that stem is already used as a database name:

datasette/datasette/app.py

Lines 401 to 417 in 03305ea

def add_database(self, db, name=None):
new_databases = self.databases.copy()
if name is None:
# Pick a unique name for this database
suggestion = db.suggest_name()
name = suggestion
else:
suggestion = name
i = 2
while name in self.databases:
name = "{}_{}".format(suggestion, i)
i += 1
db.name = name
new_databases[name] = db
# don't mutate! that causes race conditions with live import
self.databases = new_databases
return db

@simonw simonw closed this as completed in 0cd982f Feb 7, 2022
simonw added a commit that referenced this issue Feb 7, 2022
simonw added a commit that referenced this issue Feb 7, 2022
@simonw
Copy link
Owner Author

simonw commented Feb 7, 2022

That implementation broke on Python 3.6 - which is still a supported Python version for the 0.60.x branch - test_homepage failed.

>       assert (
            "2 rows in 1 table, 5 rows in 4 hidden tables, 1 view" == counts_p.text.strip()
        )
E       AssertionError: assert '2 rows in 1 ...ables, 1 view' == '1 table, 4 h...ables, 1 view'
E         - 1 table, 4 hidden tables, 1 view
E         + 2 rows in 1 table, 5 rows in 4 hidden tables, 1 view
E         ? ++++++++++         ++++++++++

That's because this idiom isn't guaranteed to preserve order in versions earlier than Python 3.7:

datasette/datasette/cli.py

Lines 552 to 553 in fa5fc32

# De-duplicate files so 'datasette db.db db.db' only attaches one /db
files = list(dict.fromkeys(files))

I could say that 0.60.2 is the first version to require Python 3.7 - but that feels a little surprising.

I'm going to use a different idiom for order-preserving de-duplication from this StackOverflow instead.

@simonw simonw reopened this Feb 7, 2022
simonw added a commit that referenced this issue Feb 7, 2022
This reverts commit 5bfd001.

No need for this on the main branch because it doesn't support Python 3.6 any more.
@simonw
Copy link
Owner Author

simonw commented Feb 7, 2022

@simonw simonw closed this as completed Feb 7, 2022
simonw added a commit to simonw/datasette-publish-fly that referenced this issue Feb 7, 2022
simonw added a commit to simonw/datasette-publish-fly that referenced this issue Feb 11, 2022
@simonw simonw added this to the Datasette 1.0 milestone Mar 19, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

1 participant