GGistDev

Normalization Basics

Design schemas that reduce redundancy and anomalies, while balancing performance needs.

1NF (First Normal Form)

  • Atomic column values, no repeating groups or arrays in a single column
  • Separate repeating data into child tables

2NF (Second Normal Form)

  • For tables with composite keys, every non-key attribute depends on the whole key
  • Avoid partial dependencies by splitting tables

3NF (Third Normal Form)

  • No transitive dependencies: non-key attributes should depend only on the key
  • Move derived attributes to their own tables or compute on read

Example

-- Orders (normalized)
CREATE TABLE customers (
  id BIGSERIAL PRIMARY KEY,
  name TEXT NOT NULL
);
CREATE TABLE products (
  id BIGSERIAL PRIMARY KEY,
  name TEXT NOT NULL,
  price NUMERIC(10,2) NOT NULL
);
CREATE TABLE orders (
  id BIGSERIAL PRIMARY KEY,
  customer_id BIGINT NOT NULL REFERENCES customers(id),
  created_at TIMESTAMPTZ NOT NULL DEFAULT now()
);
CREATE TABLE order_items (
  order_id BIGINT REFERENCES orders(id),
  product_id BIGINT REFERENCES products(id),
  qty INT NOT NULL,
  PRIMARY KEY (order_id, product_id)
);

Denormalization tradeoffs

  • Denormalize for read performance (caches, summary tables, materialized views)
  • Maintain via triggers/jobs; ensure source of truth is clear

Practical tips

  • Use surrogate keys (BIGINT) and natural keys with UNIQUE constraints
  • Index foreign keys and common access patterns
  • Document invariants that the schema enforces (and those it doesn’t)

Summary

  • Aim for 3NF baseline; denormalize thoughtfully for performance
  • Keep integrity with constraints and clear ownership of derived data