Normalization Basics
Design schemas that reduce redundancy and anomalies, while balancing performance needs.
1NF (First Normal Form)
- Atomic column values, no repeating groups or arrays in a single column
- Separate repeating data into child tables
2NF (Second Normal Form)
- For tables with composite keys, every non-key attribute depends on the whole key
- Avoid partial dependencies by splitting tables
3NF (Third Normal Form)
- No transitive dependencies: non-key attributes should depend only on the key
- Move derived attributes to their own tables or compute on read
Example
-- Orders (normalized)
CREATE TABLE customers (
id BIGSERIAL PRIMARY KEY,
name TEXT NOT NULL
);
CREATE TABLE products (
id BIGSERIAL PRIMARY KEY,
name TEXT NOT NULL,
price NUMERIC(10,2) NOT NULL
);
CREATE TABLE orders (
id BIGSERIAL PRIMARY KEY,
customer_id BIGINT NOT NULL REFERENCES customers(id),
created_at TIMESTAMPTZ NOT NULL DEFAULT now()
);
CREATE TABLE order_items (
order_id BIGINT REFERENCES orders(id),
product_id BIGINT REFERENCES products(id),
qty INT NOT NULL,
PRIMARY KEY (order_id, product_id)
);
Denormalization tradeoffs
- Denormalize for read performance (caches, summary tables, materialized views)
- Maintain via triggers/jobs; ensure source of truth is clear
Practical tips
- Use surrogate keys (BIGINT) and natural keys with UNIQUE constraints
- Index foreign keys and common access patterns
- Document invariants that the schema enforces (and those it doesn’t)
Summary
- Aim for 3NF baseline; denormalize thoughtfully for performance
- Keep integrity with constraints and clear ownership of derived data