Transactions and Isolation Levels: ACID in Practice


ACID is one of those acronyms developers can recite (Atomicity, Consistency, Isolation, Durability) without being able to say what it means in practice. The part that actually matters for writing correct application code is Isolation - specifically, isolation levels. Get the isolation level wrong and you have subtle data corruption bugs that appear only under concurrent load.

What a Transaction Is

A transaction groups multiple database operations into a single unit. Either all of them succeed and are committed, or none of them are. This is Atomicity.

BEGIN;
  UPDATE accounts SET balance = balance - 100 WHERE id = 1;
  UPDATE accounts SET balance = balance + 100 WHERE id = 2;
COMMIT;

If the second UPDATE fails, the first is rolled back. The database never ends up in a state where money has left one account but not arrived in the other.

Durability means that once a transaction is committed, it stays committed even if the server crashes immediately after. The database writes to durable storage (with write-ahead logging) before acknowledging the commit.

Consistency means the database remains in a valid state as defined by its constraints - foreign keys, unique constraints, check constraints. Transactions that would violate constraints are rejected.

These three are relatively straightforward. Isolation is where the nuance lives.

The Isolation Problem

Isolation concerns what happens when multiple transactions run simultaneously. A fully isolated system would make concurrent transactions appear to run one at a time. This is expensive to implement efficiently. In practice, databases offer different levels of isolation, each with different performance characteristics and different anomalies they may permit.

The anomalies, from mildest to most severe:

Dirty read: reading uncommitted data from another transaction. Transaction A modifies a row but hasn’t committed. Transaction B reads that row and sees A’s uncommitted change. Then A rolls back. B has read data that never officially existed.

Non-repeatable read: reading the same row twice in a transaction and getting different values because another transaction committed a change between the two reads.

Phantom read: re-running a query in a transaction and getting different rows because another transaction inserted or deleted rows that match the query’s WHERE clause.

Lost update: two transactions read a value, both modify it, both write back. The second write overwrites the first. Neither transaction saw the other’s update.

The Four Isolation Levels

The SQL standard defines four levels, each preventing a different set of anomalies:

READ UNCOMMITTED - allows dirty reads. Rarely used in practice. You can see uncommitted changes from other transactions.

READ COMMITTED - prevents dirty reads. You only see data that has been committed. However, if you read the same row twice within a transaction, you might get different values (non-repeatable reads) because another transaction committed between your reads. This is the default in PostgreSQL and Oracle.

REPEATABLE READ - prevents dirty reads and non-repeatable reads. If you read a row twice in a transaction, you get the same value both times. However, phantom reads are possible in the SQL standard definition (though PostgreSQL’s implementation actually prevents them). This is the default in MySQL’s InnoDB.

SERIALIZABLE - the strictest level. Transactions behave as if they ran serially, one at a time. All anomalies are prevented. This is implemented with either locking (which can cause deadlocks) or snapshot isolation with serialization conflict detection.

What This Means in Practice

READ COMMITTED is fine for most read operations where you’re not relying on consistency across multiple reads. Looking up a user’s profile, fetching product details - these don’t usually require that the data stays consistent for the duration of the request.

Where READ COMMITTED causes problems: any operation that reads, modifies, and writes back based on the read value.

-- Transaction A and B both run concurrently
-- Transaction A:
SELECT balance FROM accounts WHERE id = 1;  -- reads 100
-- Transaction B concurrently:
UPDATE accounts SET balance = balance + 50 WHERE id = 1;
COMMIT;  -- balance is now 150
-- Transaction A continues:
UPDATE accounts SET balance = balance - 30 WHERE id = 1;
COMMIT;  -- sets to 70, not 120! Lost B's update.

This is the lost update problem. The fix is either REPEATABLE READ (which detects the conflict) or a SELECT FOR UPDATE which takes a row lock:

BEGIN;
SELECT balance FROM accounts WHERE id = 1 FOR UPDATE;  -- locks the row
-- Now B's UPDATE will wait until we commit
UPDATE accounts SET balance = balance - 30 WHERE id = 1;
COMMIT;

SERIALIZABLE is the right choice when correctness matters more than performance, especially for financial operations, inventory management, or any operation with complex invariants. Modern implementations (Serializable Snapshot Isolation in PostgreSQL) can achieve this with much less blocking than traditional locking approaches.

Checking and Setting Isolation Level

In PostgreSQL:

-- Check current default
SHOW default_transaction_isolation;

-- Set for a specific transaction
BEGIN ISOLATION LEVEL REPEATABLE READ;

-- Set for the session
SET SESSION CHARACTERISTICS AS TRANSACTION ISOLATION LEVEL SERIALIZABLE;

Most ORMs let you specify isolation level when starting a transaction:

# SQLAlchemy
with session.begin():
    session.execute(text("SET TRANSACTION ISOLATION LEVEL REPEATABLE READ"))
    # ... your operations

Deadlocks

Higher isolation levels increase the chance of deadlocks - situations where two transactions are each waiting for the other to release a lock. The database detects these and rolls back one of the transactions.

The standard response to a deadlock error is to retry the transaction. This needs to be handled in your application code.

The main technique to prevent deadlocks: access resources in a consistent order across transactions. If transaction A always locks account 1 before account 2, and transaction B does the same, they can’t deadlock on each other.

The Practical Default

For most application code, READ COMMITTED is fine. When you have an operation that requires reading consistent data and making decisions based on it, use REPEATABLE READ or explicit row locking (SELECT FOR UPDATE). For critical financial or inventory operations where phantom reads or write skew could cause corruption, use SERIALIZABLE.

The instinct to always use the highest isolation level “to be safe” is understandable but often wrong - SERIALIZABLE has meaningful performance cost and higher abort/retry rates. Match the isolation level to the actual consistency requirements of each operation.



Read more