GGistDev

Joins

Combine rows across tables with relational joins.

Inner join

SELECT o.id, c.name, o.total
FROM orders o
JOIN customers c ON c.id = o.customer_id;

Keeps only rows with matching keys.

Left/Right/Full outer joins

-- Left: keep all left rows; right side may be NULL
SELECT c.id, c.name, o.id AS order_id
FROM customers c
LEFT JOIN orders o ON o.customer_id = c.id;

-- Full (if supported): keep rows from both sides, fill with NULLs
SELECT *
FROM a FULL OUTER JOIN b ON a.k = b.k;

USING vs ON

SELECT * FROM orders JOIN customers USING (id); -- same-named key

USING merges same-named columns; ON is explicit and more flexible.

Cross join

SELECT * FROM a CROSS JOIN b; -- Cartesian product

Use with care.

Anti-join and semi-join

-- Anti-join: rows in c without orders
SELECT c.*
FROM customers c
LEFT JOIN orders o ON o.customer_id = c.id
WHERE o.id IS NULL;

-- Semi-join: rows in c with at least one order
SELECT c.* FROM customers c WHERE EXISTS (
  SELECT 1 FROM orders o WHERE o.customer_id = c.id
);

Outer join filters: ON vs WHERE

Filters in WHERE run after the join; for outer joins, put side-specific filters in ON to avoid unintentionally removing NULL-extended rows.

-- Keep customers without orders and only paid orders when present
SELECT c.*, o.*
FROM customers c
LEFT JOIN orders o
  ON o.customer_id = c.id AND o.status = 'paid';

Pitfalls

  • Missing ON condition yields a huge Cartesian product
  • Ambiguous column names; qualify with table aliases
  • Duplicates from one-to-many relationships; aggregate or DISTINCT if needed

Summary

  • Choose the join type by row-retention semantics
  • Be deliberate about ON vs WHERE, and use EXISTS/anti-joins for presence checks