GistDev

Pipes and Filters

Compose powerful one-liners by chaining simple tools.

Pipelines

cat access.log | grep "/api" | cut -d' ' -f1 | sort | uniq -c | sort -nr | head

Prefer direct input (avoid useless cat):

grep "/api" access.log | cut -d' ' -f1 | sort | uniq -c | sort -nr | head
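To see what the pipeline produces, here is a minimal sketch run against a tiny made-up log. The temporary directory, the log contents, and the variable name top_clients are assumptions for illustration only.

```shell
# Build a small sample access log in a temporary directory (hypothetical data).
tmpdir=$(mktemp -d)
cat > "$tmpdir/access.log" <<'EOF'
10.0.0.1 - - [01/Jan/2024:00:00:01 +0000] "GET /api/users HTTP/1.1" 200 512
10.0.0.2 - - [01/Jan/2024:00:00:02 +0000] "GET /index.html HTTP/1.1" 200 1024
10.0.0.1 - - [01/Jan/2024:00:00:03 +0000] "GET /api/items HTTP/1.1" 200 256
10.0.0.3 - - [01/Jan/2024:00:00:04 +0000] "POST /api/users HTTP/1.1" 201 64
10.0.0.1 - - [01/Jan/2024:00:00:05 +0000] "GET /api/users HTTP/1.1" 200 512
EOF

# Keep /api requests, take the client IP (field 1), count per IP, busiest first.
top_clients=$(grep "/api" "$tmpdir/access.log" | cut -d' ' -f1 | sort | uniq -c | sort -nr | head)
printf '%s\n' "$top_clients"

rm -rf "$tmpdir"
```

Each output line is a count followed by an IP; the width of the padding before the count varies between uniq implementations.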

Common filters

  • grep: search lines (-E for extended regex, -i to ignore case)
  • sed: stream edits (s/old/new/g, deletion, insertion)
  • awk: field-aware processing (-F to set delimiter)
  • sort/uniq: order and count; uniq -c only merges adjacent duplicates, so sort first
  • cut/paste: select columns or merge by columns
  • tr: translate/delete characters (e.g., tr -d '\r')
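Two of these filters have easy-to-miss behavior, sketched below with made-up input: uniq without a preceding sort, and tr -d '\r' on CRLF text. The variable names are illustrative.

```shell
# uniq compares only neighboring lines, so unsorted input yields too many groups.
unsorted_groups=$(printf 'b\na\nb\n' | uniq -c | wc -l | tr -d ' ')
sorted_groups=$(printf 'b\na\nb\n' | sort | uniq -c | wc -l | tr -d ' ')
echo "groups without sort: $unsorted_groups, with sort: $sorted_groups"

# tr -d '\r' removes carriage returns, turning CRLF line endings into LF.
clean=$(printf 'one\r\ntwo\r\n' | tr -d '\r')
printf '%s\n' "$clean"
```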

Grep examples

grep -E "ERROR|WARN" app.log
grep -R "TODO" src/

Awk examples

awk -F, '{ sum += $3 } END { print sum }' data.csv
awk '{print $1}' access.log | sort | uniq -c | sort -nr | head
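As a rough illustration of the column sum, the sketch below runs it on a small made-up CSV. The file contents and the NR > 1 header skip are assumptions added here, and splitting on a bare comma does not handle quoted fields that contain commas.

```shell
# Hypothetical three-column CSV with a header row.
tmpdir=$(mktemp -d)
printf 'id,name,amount\n1,alpha,10\n2,beta,32\n' > "$tmpdir/data.csv"

# NR > 1 skips the header explicitly; "sum + 0" prints 0 instead of an empty line
# when there are no data rows.
total=$(awk -F, 'NR > 1 { sum += $3 } END { print sum + 0 }' "$tmpdir/data.csv")
echo "$total"

rm -rf "$tmpdir"
```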

Sed examples

sed -E 's/^([^ ]+) - - .*/\1/' access.log | sort | uniq -c | sort -nr | head
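A minimal sketch of the same capture-group substitution on a single made-up log line. It writes the leading field as [^ ]+ (any run of non-space characters), a portable spelling chosen here because \S is not understood by every sed.

```shell
# One hypothetical access-log line in common log format.
line='10.0.0.1 - - [01/Jan/2024:00:00:01 +0000] "GET / HTTP/1.1" 200 5'

# Capture the first space-delimited field and replace the whole line with it.
ip=$(printf '%s\n' "$line" | sed -E 's/^([^ ]+) - - .*/\1/')
echo "$ip"
```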

xargs (batching arguments)

git ls-files -z | xargs -0 wc -l | sort -nr | head
  • Pair a NUL-delimited producer (git ls-files -z, find -print0) with xargs -0 to handle spaces and newlines in filenames safely
  • Use -P N to run up to N commands in parallel where your xargs supports it (a common extension, not required by POSIX)
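The difference NUL delimiters make can be sketched with a filename that contains a space. The files below are made up, and the example assumes the temporary directory path itself contains no whitespace.

```shell
# Two hypothetical files, one with a space in its name.
tmpdir=$(mktemp -d)
printf 'a\nb\n' > "$tmpdir/two lines.txt"
printf 'c\n' > "$tmpdir/one.txt"

# NUL-delimited: each filename arrives at cat intact, so all 3 lines are read.
safe=$(find "$tmpdir" -type f -name '*.txt' -print0 | xargs -0 cat | wc -l | tr -d ' ')

# Whitespace-delimited: "two lines.txt" is split into two bogus arguments,
# cat fails on them, and only one.txt is read. "|| true" ignores that failure.
unsafe=$(find "$tmpdir" -type f -name '*.txt' | xargs cat 2>/dev/null | wc -l | tr -d ' ') || true

echo "lines read with -print0/-0: $safe, without: $unsafe"
rm -rf "$tmpdir"
```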

tee (inspect and forward)

cmd | tee out.log | grep "ERROR"

tee copies its input to out.log and passes it unchanged to stdout, so the next stage still sees every line.
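A small sketch of the tap, with printf standing in for the hypothetical cmd: the file receives every line, while the downstream grep keeps only the matches.

```shell
tmpdir=$(mktemp -d)

# printf plays the role of "cmd"; tee saves the full stream and forwards it.
errors=$(printf 'INFO start\nERROR disk full\nINFO done\n' | tee "$tmpdir/out.log" | grep "ERROR")
logged=$(wc -l < "$tmpdir/out.log" | tr -d ' ')

echo "matched: $errors"
echo "lines saved to out.log: $logged"
rm -rf "$tmpdir"
```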

Safe find + exec

find logs -type f -name '*.log' -print0 | xargs -0 gzip -9
# or
find logs -type f -name '*.log' -exec gzip -9 {} +

{} + passes many files to each invocation of the command; {} \; runs the command once per file.
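One way to observe the batching difference is to replace gzip with a tiny sh -c command that reports how many arguments it received. The three empty files are made up for the demonstration.

```shell
tmpdir=$(mktemp -d)
touch "$tmpdir/a.log" "$tmpdir/b.log" "$tmpdir/c.log"

# With "{} +", one invocation receives all three files, so "$#" prints 3 once.
batched=$(find "$tmpdir" -type f -name '*.log' -exec sh -c 'echo "$#"' sh {} +)

# With "{} \;", the command runs three times, each with a single argument.
per_file_runs=$(find "$tmpdir" -type f -name '*.log' -exec sh -c 'echo "$#"' sh {} \; | wc -l | tr -d ' ')

echo "args in the batched run: $batched; runs with \\;: $per_file_runs"
rm -rf "$tmpdir"
```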

Performance tips

  • Minimize processes (combine with awk/sed when reasonable)
  • Use the C locale for faster sort: LC_ALL=C sort (orders by byte value, so uppercase sorts before lowercase)
  • Avoid unnecessary cat; let the first command read the file directly
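Two of these tips can be sketched on made-up input: folding a grep stage into awk, and the byte-order result of sorting in the C locale. Variable names and sample lines are illustrative.

```shell
lines='10.0.0.1 GET /api/users
10.0.0.2 GET /index.html
10.0.0.3 GET /api/items'

# Two processes: grep filters, awk extracts the first field.
two_procs=$(printf '%s\n' "$lines" | grep "/api" | awk '{ print $1 }')

# One process: an awk pattern does the filtering and the extraction.
one_proc=$(printf '%s\n' "$lines" | awk '/\/api/ { print $1 }')

# In the C locale, sort compares bytes, so uppercase ASCII comes before lowercase.
c_sorted=$(printf 'b\nA\na\nB\n' | LC_ALL=C sort | paste -s -d ' ' -)

printf '%s\n' "$one_proc"
echo "$c_sorted"
```

Whether the C locale ordering is acceptable depends on the data; it is a good fit when the sort only exists to group identical lines for uniq.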

Summary

  • Chain simple tools with |; use xargs and tee for batching and tapping
  • Prefer -print0/-0 for safety with arbitrary filenames