Compressing Postgres Data?
Under the hood, Postgres does some data compression for anything exceeding a certain size threshold. Learn about the Postgres Page and TOAST.
Table of Contents 📖
TOAST
Postgres has a built-in feature called TOAST (The Oversized-Attribute Storage Technique), that automatically compresses large data types (TEXT, JSONB, etc.) when they exceed a certain size threshold. This is done automatically by Postgres. As a demonstration, consider the following SQL. This inserts around 1 million characters into a row. This is a very large string so Postgres will automatically compress the data.
WARNING: The data types subjected to TOAST are primarily variable-length ones like TEXT.
CREATE TABLE example_table (
id SERIAL PRIMARY KEY,
large_text_column TEXT
);
-- Insert a large text string
INSERT INTO example_table (large_text_column)
VALUES (repeat('a', 1000000)); -- 1 million characters
Postgres Pages
This size limit comes from the Postgres page, a disk storage mechanism that is limited to 8kb in size. When data does not fit into a page, TOAST kicks in. TOAST will first compress the data, but if the data is still too large then it will be broken up. In other words, Postgres will carry out compression first and out-of-line storage next, if compression is not enough. We can see how large this data is by inspecting the size of the table.
SELECT pg_size_pretty(pg_total_relation_size('example_table'));
Database Overgrowth
Sadly, TOAST is more so for managing large values and less for preventing database overgrowth. In other words, TOAST won't help decrease the size of your database from 20TB to 5TB. It also carries some overhead. To prevent a database from overgrowing, I would recommend you look into partitioning, serverless solutions, vertical scaling, and more.