Normalization Basics

Design schemas that reduce redundancy and anomalies, while balancing performance needs.

1NF (First Normal Form)

Atomic column values, no repeating groups or arrays in a single column
Separate repeating data into child tables

2NF (Second Normal Form)

For tables with composite keys, every non-key attribute depends on the whole key
Avoid partial dependencies by splitting tables

3NF (Third Normal Form)

No transitive dependencies: non-key attributes should depend only on the key
Move derived attributes to their own tables or compute on read

Example

-- Orders (normalized)
CREATE TABLE customers (
  id BIGSERIAL PRIMARY KEY,
  name TEXT NOT NULL
);
CREATE TABLE products (
  id BIGSERIAL PRIMARY KEY,
  name TEXT NOT NULL,
  price NUMERIC(10,2) NOT NULL
);
CREATE TABLE orders (
  id BIGSERIAL PRIMARY KEY,
  customer_id BIGINT NOT NULL REFERENCES customers(id),
  created_at TIMESTAMPTZ NOT NULL DEFAULT now()
);
CREATE TABLE order_items (
  order_id BIGINT REFERENCES orders(id),
  product_id BIGINT REFERENCES products(id),
  qty INT NOT NULL,
  PRIMARY KEY (order_id, product_id)
);

Denormalization tradeoffs

Denormalize for read performance (caches, summary tables, materialized views)
Maintain via triggers/jobs; ensure source of truth is clear

Practical tips

Use surrogate keys (BIGINT) and natural keys with UNIQUE constraints
Index foreign keys and common access patterns
Document invariants that the schema enforces (and those it doesn’t)

Summary

Aim for 3NF baseline; denormalize thoughtfully for performance
Keep integrity with constraints and clear ownership of derived data