This repository has been archived by the owner on Apr 26, 2024. It is now read-only.

Update UPSERT comment now that native upserts are the default
David Robertson committed Sep 27, 2022
1 parent ac1b0d0 commit 8d486b3
Showing 1 changed file with 41 additions and 10 deletions.
51 changes: 41 additions & 10 deletions synapse/storage/database.py
@@ -1141,17 +1141,48 @@ async def simple_upsert(
desc: str = "simple_upsert",
lock: bool = True,
) -> bool:
"""
"""Insert a row with values + insertion_values; on conflict, update with values.
All of our supported databases accept the nonstandard "upsert" statement in
their dialect of SQL. We call this a "native upsert". The syntax looks roughly
like:
INSERT INTO table VALUES (values + insertion_values)
ON CONFLICT (keyvalues)
DO UPDATE SET (values); -- overwrite `values` columns only
If (values) is empty, the resulting query is slightly simpler:
INSERT INTO table VALUES (insertion_values)
ON CONFLICT (keyvalues)
DO NOTHING; -- do not overwrite any columns
This function is a helper to build such queries.
In order for upserts to make sense, the database must be able to determine when
an upsert CONFLICTs with an existing row. Postgres and SQLite ensure this by
requiring that a unique index exist on the column names used to detect a
conflict (i.e. `keyvalues.keys()`).
If there is no such index, we can "emulate" an upsert with a SELECT followed
by either an INSERT or an UPDATE. This is unsafe: we cannot make the same
atomicity guarantees that a native upsert can and are very vulnerable to races
and crashes. Therefore if we wish to upsert without an appropriate unique index,
we must either:
1. Acquire a table-level lock before the emulated upsert (`lock=True`), or
2. VERY CAREFULLY ensure that we are the only thread and worker which will be
writing to this table, in which case we can proceed without a lock
(`lock=False`).
Generally speaking, you should use `lock=True`. If the table in question has a
unique index[*], this class will use a native upsert (which is atomic and so can
ignore the `lock` argument). Otherwise this class will use an emulated upsert,
in which case we want the safer option unless we have been VERY CAREFUL.
`lock` should generally be set to True (the default), but can be set
to False if either of the following is true:
1. there is a UNIQUE INDEX on the key columns. In this case a conflict
will cause an IntegrityError in which case this function will retry
the update.
2. we somehow know that we are the only thread which will be updating
this table.
As an additional note, this parameter only matters for old SQLite versions
because we will use native upserts otherwise.
[*]: This class is aware that some tables have unique indices added as
background updates. It checks at runtime to see if those updates are
pending: if not, they are deemed safe for use with native upserts.
Args:
table: The table to upsert into
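
To make the "native upsert" shape described in the new docstring concrete, here is a minimal sketch that builds such a statement and runs it against an in-memory SQLite database. It is not Synapse's implementation: `build_native_upsert`, the `profiles` table, and its columns are hypothetical names chosen for illustration, and the sketch assumes an SQLite version with `ON CONFLICT` support (3.24 or later) and `?` placeholders.

```python
import sqlite3
from typing import Any, Dict, List, Optional, Tuple


def build_native_upsert(
    table: str,
    keyvalues: Dict[str, Any],
    values: Dict[str, Any],
    insertion_values: Optional[Dict[str, Any]] = None,
) -> Tuple[str, List[Any]]:
    """Hypothetical helper: return (sql, args) for a native upsert."""
    # Columns written on a fresh insert: keys, insert-only values, then values.
    all_values = {**keyvalues, **(insertion_values or {}), **values}
    columns = ", ".join(all_values)
    placeholders = ", ".join("?" for _ in all_values)
    conflict_target = ", ".join(keyvalues)

    if values:
        # On conflict, overwrite only the `values` columns, using the row
        # we attempted to insert (exposed by the database as `excluded`).
        action = "UPDATE SET " + ", ".join(
            f"{col} = excluded.{col}" for col in values
        )
    else:
        # Nothing to overwrite, so leave the existing row untouched.
        action = "NOTHING"

    sql = (
        f"INSERT INTO {table} ({columns}) VALUES ({placeholders}) "
        f"ON CONFLICT ({conflict_target}) DO {action}"
    )
    return sql, list(all_values.values())


def demo() -> List[Tuple[Any, ...]]:
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE profiles (user_id TEXT, displayname TEXT, created_ts INTEGER)"
    )
    # The unique index is what lets the database detect a conflict.
    conn.execute("CREATE UNIQUE INDEX profiles_user_id ON profiles (user_id)")

    key = {"user_id": "@alice:example.org"}
    # First call inserts the row, including the insert-only `created_ts`.
    conn.execute(*build_native_upsert("profiles", key, {"displayname": "Alice"}, {"created_ts": 1}))
    # Second call conflicts on `user_id` and updates `displayname` only.
    conn.execute(*build_native_upsert("profiles", key, {"displayname": "Alice B."}, {"created_ts": 2}))

    rows = conn.execute(
        "SELECT user_id, displayname, created_ts FROM profiles"
    ).fetchall()
    conn.close()
    return rows
```

Calling `demo()` leaves a single row whose `displayname` reflects the second call while `created_ts` keeps the value from the first, which matches the docstring's "overwrite `values` columns only" behaviour. Passing an empty `values` dict produces the `DO NOTHING` form instead.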
