@rkenmi - PostgreSQL - a powerhouse relational database

PostgreSQL - a powerhouse relational database

Updated on July 11, 2025

PostgreSQL is one of the most popular SQL database engines and is widely used in the tech industry. It is a great starting point for building an application that needs fast reads and writes (with an emphasis on reads) and guarantees ACID properties through atomic transactions with configurable isolation levels. Let's begin!

Overview of Benefits

PostgreSQL is a powerhouse solution on a single machine. On a single node, it delivers excellent read and write performance even before any tuning of the database engine. It is also highly vertically scalable - adding more memory, CPU cores, and disk space gives it a significant boost in capabilities.

On typical commodity hardware, we can expect ballpark performance along these lines:

Read Performance

  • Index reads can range from 10,000 TPS - 20,000 TPS per CPU core
  • Complex join queries: Around 1,000 - 2,000 TPS
  • Full table scan: This is naturally slow, but is especially egregious if the records cannot fit into memory

Write Performance

  • Single-table INSERT INTO: ~5,000 TPS per CPU core
  • Single-table UPDATE (plus index updates): ~1,000 TPS per CPU core
  • Multi-table/index writes from complex transactions: ~100 TPS
  • Bulk operations: ~10,000 TPS

Bottlenecks

PostgreSQL performance on a single node will degrade if any of the following is true:

  • The working set of data cannot fit in memory
  • Disk space is exceeded
  • Disk I/O is saturated
    • The write throughput above is bounded by disk I/O while flushing the WAL (write-ahead log)
  • Queries involve complex joins or updates across multiple tables/indexes
  • Tables exceed roughly 10 million records:
    • Queries and joins start to stagnate
    • Full-text search also starts to degrade

ACID Guarantees

PostgreSQL is widely known for its ACID properties - Atomicity, Consistency, Isolation, and Durability. ACID generally applies to transactions in PostgreSQL, which require a group of operations to succeed as a unit before being committed to the database. This distinction matters most for writes (as opposed to plain queries), since writes modify the state of the database and its records.

  • Atomicity means that operations can be rolled back in the event of failures. This is important if you have a banking application and you want to send your friend money - you must ensure that funds are withdrawn from your account before your friend receives them! If an error occurs midway, we DO want to roll back and correct it (or try again).
  • Consistency means that transactions bring the state of the database from one valid state to another valid state. To be more specific, that means that if there are rules associated to database tables, then transactions that violate those rules will not be committed, and the database state remains unchanged.

For example, on the table below, if a transaction tries to deduct bananas to below 0, the CHECK constraint will be enforced, ensuring that the database never falls into an invalid state.

CREATE TABLE basket(  
    id TEXT PRIMARY KEY,
    bananas DECIMAL CHECK (bananas >= 0),
    oranges DECIMAL CHECK (oranges >= 0)
);

Insertion with improper values causes the database state to rollback to a valid state:

postgres=# INSERT INTO basket (id, bananas, oranges) VALUES ('1', 1, -1);  
ERROR:  new row for relation "basket" violates check constraint "basket_oranges_check"  
DETAIL:  Failing row contains (1, 1, -1).  

Trying to update a valid row with improper values also does the same thing:

postgres=# INSERT INTO basket (id, bananas, oranges) VALUES ('1', 1, 3);  
postgres=# UPDATE basket SET bananas = bananas - 2 WHERE id = '1';  
ERROR:  new row for relation "basket" violates check constraint "basket_bananas_check"  
DETAIL:  Failing row contains (1, -1, 3).  

This is not the same as Consistency in CAP Theorem, where the CAP Theorem definition talks about a consistent state of data across replicas.

  • Isolation refers to the transaction isolation level when concurrent transactions happen. This matters because, by default, each query inside a transaction sees the latest committed values, which can lead to an inconsistent read view across concurrent transactions. PostgreSQL offers several transaction isolation levels to customize this. For example, you can set the isolation level to SERIALIZABLE if you want strong consistency guarantees for your queries. The downside is that if concurrent transactions conflict, one of them will fail with a serialization error and roll back, meaning the application layer must catch this error and retry appropriately.
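
As a minimal sketch, the isolation level can be set per transaction block; here we reuse the basket table from above:

```sql
BEGIN ISOLATION LEVEL SERIALIZABLE;

SELECT bananas FROM basket WHERE id = '1';

UPDATE basket SET bananas = bananas - 1 WHERE id = '1';

COMMIT;  -- may fail with SQLSTATE 40001 (serialization_failure)
         -- if a concurrent transaction conflicts; retry in that case
```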

  • Durability means that in the event of a system failure, committed results are preserved. PostgreSQL uses a WAL (write-ahead log) and persists data to disk, so recovery is fairly straightforward.

Transactions

PostgreSQL allows you to define one atomic block of operations. To demonstrate this, we'll first create an example banking table with two columns: account_id and balance.

postgres=# CREATE TABLE accounts (  
    account_id TEXT PRIMARY KEY,
    balance DECIMAL CHECK (balance >= 0)
);

Now we will populate some rows:

INSERT INTO accounts (account_id, balance) values ('foo', 301.25);  
INSERT INTO accounts (account_id, balance) values ('ben', 120.00);  
INSERT INTO accounts (account_id, balance) values ('jane', 191.05);  

Resulting rows:

 account_id | balance 
------------+---------
 foo        |  301.25
 ben        |  120.00
 jane       |  191.05

Declaring a Transaction

We can declare the transaction using the BEGIN and COMMIT keywords.

Suppose that we want to have a transaction block where we transfer $100 from Ben's account to Jane's.
It could look something like this:

BEGIN;

UPDATE accounts  
SET balance = balance - 100  
WHERE account_id = 'ben';

UPDATE accounts  
SET balance = balance + 100  
WHERE account_id = 'jane';

COMMIT;  

And the result:

 account_id | balance 
------------+---------
 foo        |  301.25
 ben        |   20.00
 jane       |  291.05

The transaction block here guarantees that both update operations succeed before committing. If either fails, the whole block is rolled back, guaranteeing atomicity for the transfer.
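
To see the rollback in action: Ben's balance is now 20.00, so attempting another $100 transfer violates the CHECK constraint on balance, and nothing in the block takes effect. A sketch:

```sql
BEGIN;

UPDATE accounts
SET balance = balance - 100
WHERE account_id = 'ben';
-- ERROR:  new row for relation "accounts" violates check constraint "accounts_balance_check"

UPDATE accounts
SET balance = balance + 100
WHERE account_id = 'jane';
-- The transaction is already aborted; this statement is ignored.

COMMIT;
-- psql reports ROLLBACK here: neither balance changes.
```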

Transaction Isolation Level

Suppose that Ben has a starting balance of 100.

UPDATE accounts  
SET balance = 100  
WHERE account_id = 'ben';  

Then we create a transaction deducting $100 from Ben's account.

BEGIN;

UPDATE accounts  
SET balance = balance - 100  
WHERE account_id = 'ben';  

As this transaction runs, what if another transaction depletes Ben's entire balance and commits just after our BEGIN statement, before our UPDATE runs? In this scenario, it isn't clear what the balance is before the update. Is it 100 or is it 0?

By default, if another transaction commits changes to rows before a different transaction queries them, the query will read the latest committed rows. This is the non-repeatable read phenomenon: repeated queries within one transaction can return different results. The default READ COMMITTED isolation level allows this behavior.
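
The anomaly can be sketched with two concurrent sessions, using the accounts table from above (Ben starts at 100):

```sql
-- Session A:
BEGIN;  -- default READ COMMITTED
SELECT balance FROM accounts WHERE account_id = 'ben';  -- returns 100

-- Session B (autocommitted, runs concurrently):
UPDATE accounts SET balance = 0 WHERE account_id = 'ben';

-- Session A again, same transaction:
SELECT balance FROM accounts WHERE account_id = 'ben';  -- now returns 0
COMMIT;  -- the two reads disagreed: a non-repeatable read
```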

A few important concurrency anomalies to cover:

  • Dirty Reads
    • A transaction sees uncommitted changes from another transaction.
    • This is never an issue in PostgreSQL - even the default READ COMMITTED level only ever sees committed data.
  • Non-Repeatable Reads
    • As described above, repeated queries of the same rows within a transaction can return different results, leading to possible inconsistency issues.
    • To fix this, set the transaction isolation level to REPEATABLE READ.
  • Phantom Reads
    • Repeated queries can return new or missing rows caused by INSERT or DELETE operations in concurrent transactions.
    • The nuance vs. non-repeatable reads is that the SQL standard still permits phantoms at REPEATABLE READ: the result set can gain newly inserted rows or lose deleted ones. (PostgreSQL's snapshot-based implementation of REPEATABLE READ happens to prevent phantoms as well, but portable code shouldn't rely on that.)
    • To eliminate phantom reads and other serialization anomalies, use SERIALIZABLE.

As you go from READ COMMITTED to REPEATABLE READ and SERIALIZABLE, you gain stronger data consistency guarantees. On the flip side, you will notice slower performance, since transactions may now have to wait for other transactions to finish. New errors may also arise that have to be caught at the application layer; for example, under SERIALIZABLE the database can abort a transaction with a serialization failure (SQLSTATE 40001) when it detects a dangerous dependency between concurrent transactions, in addition to ordinary deadlock errors.
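
A sketch of such a failure, assuming the accounts table above and two concurrent sessions that each read all balances and then write a row the other has read (classic write skew):

```sql
-- Session A:
BEGIN ISOLATION LEVEL SERIALIZABLE;
SELECT sum(balance) FROM accounts;

-- Session B (concurrently):
BEGIN ISOLATION LEVEL SERIALIZABLE;
SELECT sum(balance) FROM accounts;
UPDATE accounts SET balance = balance + 10 WHERE account_id = 'jane';
COMMIT;

-- Session A again:
UPDATE accounts SET balance = balance + 10 WHERE account_id = 'ben';
COMMIT;
-- One of the two sessions may be aborted with SQLSTATE 40001
-- (serialization_failure); the application must catch this and
-- retry the entire transaction from BEGIN.
```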

Partitioning

PostgreSQL allows partitioning of data, which improves:

  • Query performance: queries that filter on the partition key only scan the matching partitions (partition pruning)
  • Maintenance: old data can be removed by detaching or dropping a whole partition instead of running a large DELETE
  • Data distribution: with logical replication, individual partitions can be published and replicated selectively to other nodes

We can create a table that is partitioned by region_id:

CREATE TABLE sales_data (  
    sale_id SERIAL,
    region_id INT NOT NULL,
    product_name VARCHAR(100),
    sale_amount DECIMAL(10, 2),
    sale_date DATE
) PARTITION BY LIST (region_id);

We can create a default partition, which catches rows that don't match any explicit partition:

CREATE TABLE sales_data_default PARTITION OF sales_data DEFAULT;  
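
To actually benefit from pruning, explicit partitions are declared per list value - a sketch, assuming regions 1-3 are known up front (the partition names here are illustrative):

```sql
CREATE TABLE sales_data_region_1 PARTITION OF sales_data FOR VALUES IN (1);
CREATE TABLE sales_data_region_2 PARTITION OF sales_data FOR VALUES IN (2);
CREATE TABLE sales_data_region_3 PARTITION OF sales_data FOR VALUES IN (3);
```

Rows with any other region_id fall through to the default partition.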

And finally, we can insert records as such:

INSERT INTO sales_data (region_id, product_name, sale_amount, sale_date) VALUES  
(1, 'Laptop', 1200.00, '2025-07-01'),
(2, 'Mouse', 25.00, '2025-07-02'),
(1, 'Keyboard', 75.00, '2025-07-03'),
(3, 'Monitor', 300.00, '2025-07-04'),
(99, 'Webcam', 50.00, '2025-07-05'); -- This will go to the default partition
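
Reads that filter on the partition key benefit from partition pruning - the planner skips partitions whose list values cannot match. As a sketch, EXPLAIN can confirm which partitions are scanned:

```sql
-- Only partitions that can contain region_id = 1 appear in the plan:
EXPLAIN SELECT * FROM sales_data WHERE region_id = 1;
```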

Article Tags: unlisted, sql, postgres