GGistDev

Subqueries and CTEs

Compose complex queries with nested SELECTs and common table expressions.

Scalar and table subqueries

-- Scalar
SELECT id, (SELECT MAX(total) FROM orders WHERE customer_id = c.id) AS max_total
FROM customers c;

-- Table
SELECT * FROM (
  SELECT id, name FROM users WHERE active
) AS u
WHERE name LIKE 'A%';

EXISTS and IN

SELECT * FROM customers c
WHERE EXISTS (
  SELECT 1 FROM orders o WHERE o.customer_id = c.id
);

SELECT * FROM customers WHERE id IN (SELECT customer_id FROM orders);

EXISTS often performs better with correlated checks.

Correlated subqueries

SELECT c.id,
       (
         SELECT SUM(total)
         FROM orders o
         WHERE o.customer_id = c.id
       ) AS spend
FROM customers c;

CTEs (WITH)

WITH recent_orders AS (
  SELECT * FROM orders WHERE created_at >= CURRENT_DATE - INTERVAL '30 days'
)
SELECT customer_id, COUNT(*)
FROM recent_orders
GROUP BY customer_id;

CTEs improve readability. Some engines inline/optimize them; others may materialize.

Recursive CTEs (example)

WITH RECURSIVE ancestors AS (
  SELECT id, parent_id FROM nodes WHERE id = 42
  UNION ALL
  SELECT n.id, n.parent_id
  FROM nodes n
  JOIN ancestors a ON n.id = a.parent_id
)
SELECT * FROM ancestors;

Notes on optimization

  • Prefer EXISTS for presence checks; avoid SELECT * in subqueries
  • Watch for CTE materialization (e.g., older Postgres versions)

Summary

  • Use subqueries for local computations; CTEs for readability and reuse
  • Correlate carefully; consider performance trade-offs