Strings and Runes in Go

A Go string is an immutable sequence of bytes; a rune is an int32 value that represents one Unicode code point.

Bytes vs runes

A string is a sequence of bytes, conventionally UTF-8 encoded text (Go does not enforce this). A rune is a Unicode code point. Converting to []rune decodes the UTF-8 into code points; converting to []byte copies the raw bytes.

s := "héllo"           // UTF-8 encoded
bs := []byte(s)         // bytes
rs := []rune(s)         // runes (Unicode code points)
_ = bs; _ = rs
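
To make the difference concrete, here is a small sketch that prints both views of the same string. The helper name byteAndRuneCounts is made up for this example, and the literal uses the escape \u00e9 so that é is certain to be the single precomposed code point.

```go
package main

import "fmt"

// byteAndRuneCounts reports how many bytes and how many code points s holds.
// (Hypothetical helper, for illustration only.)
func byteAndRuneCounts(s string) (nBytes, nRunes int) {
	return len([]byte(s)), len([]rune(s))
}

func main() {
	s := "h\u00e9llo"              // "héllo" with a precomposed é
	fmt.Printf("% x\n", []byte(s)) // raw UTF-8 bytes; é occupies two of them
	fmt.Printf("%q\n", []rune(s))  // decoded code points
	fmt.Println(byteAndRuneCounts(s))
}
```

The byte view is one element longer than the rune view because é is encoded as two bytes.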

Iteration

A for range loop over a string decodes one rune per iteration, yielding the rune's starting byte index and its value. A classic index loop visits individual bytes.

for i, r := range s {   // by rune
    _ = i  // byte index of rune
    _ = r  // rune value
}

for i := 0; i < len(s); i++ { // by byte
    _ = s[i]
}
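
As a rough illustration of the byte index that range reports, the sketch below collects the starting offset of every rune. runeStarts is a name invented for this example. It also shows what range does with a byte that is not valid UTF-8: it yields U+FFFD and advances by one byte.

```go
package main

import "fmt"

// runeStarts returns the byte offset at which each rune of s begins.
// (Hypothetical helper, for illustration only.)
func runeStarts(s string) []int {
	var starts []int
	for i := range s {
		starts = append(starts, i)
	}
	return starts
}

func main() {
	fmt.Println(runeStarts("h\u00e9llo")) // [0 1 3 4 5]: index 2 is skipped

	for i, r := range "a\xffb" { // \xff is not valid UTF-8
		fmt.Printf("%d:%U ", i, r) // the bad byte shows up as U+FFFD
	}
	fmt.Println()
}
```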

Length

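len(s) returns the number of bytes, not characters. Use utf8.RuneCountInString(s) from unicode/utf8 (or len([]rune(s)), which allocates a slice) for the rune count. Even the rune count can differ from what a reader perceives as characters because of normalization and grapheme clusters (emoji with skin tones, combining accents).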

_ = len(s)         // bytes
_ = len([]rune(s)) // runes (a bare len(...) statement does not compile)
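
A minimal sketch of counting runes without building a []rune, using utf8.RuneCountInString from the standard library. The wrapper runeLen is only there to give the example a name. The second string shows why even a rune count is not a character count.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// runeLen counts code points without allocating a []rune.
// (Hypothetical wrapper, for illustration only.)
func runeLen(s string) int {
	return utf8.RuneCountInString(s)
}

func main() {
	precomposed := "\u00e9" // é as one code point
	combined := "e\u0301"   // e followed by a combining acute accent

	fmt.Println(len(precomposed), runeLen(precomposed)) // 2 1
	fmt.Println(len(combined), runeLen(combined))       // 3 2
}
```

Both strings render as a single é, yet they differ in byte length and in rune count.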

Indexing and slicing

  • s[i] is the i-th byte (not rune)
  • Slicing uses byte indices; Go does not check them, so a cut inside a multi-byte rune yields invalid UTF-8

// slice at a rune boundary: drop the first rune (needs import "unicode/utf8")
_, size := utf8.DecodeRuneInString(s) // size = byte width of the first rune
_ = s[size:]
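
One way to slice by rune count without converting the whole string to []rune is to let range find the byte offset. The sketch below assumes a helper named firstRunes, which is not part of any library.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// firstRunes returns the prefix of s containing at most n runes.
// It slices at a byte offset reported by range, so it never splits a rune.
// (Hypothetical helper, for illustration only.)
func firstRunes(s string, n int) string {
	if n <= 0 {
		return ""
	}
	count := 0
	for i := range s {
		if count == n {
			return s[:i]
		}
		count++
	}
	return s // s has n runes or fewer
}

func main() {
	s := "h\u00e9llo"
	fmt.Println(firstRunes(s, 2))        // hé
	fmt.Println(utf8.ValidString(s[:2])) // false: the cut lands inside é
}
```

The loop stops as soon as it reaches the cut point, so a short prefix of a long string stays cheap.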

Conversions

s2 := string([]rune{'你','好'})
raw := []byte("go") // named raw so it does not shadow the bytes package
str := string(raw)
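
Two properties of these conversions are easy to miss: converting between string and []byte copies the data, and converting an integer to string produces the code point with that value, not its decimal digits. The sketch below illustrates both; capitalizeCopy is a made-up helper that handles ASCII only.

```go
package main

import (
	"fmt"
	"strconv"
)

// capitalizeCopy upper-cases the first byte of a copy of s (ASCII only)
// and returns both the untouched original and the modified copy.
// (Hypothetical helper, for illustration only.)
func capitalizeCopy(s string) (orig, changed string) {
	b := []byte(s) // the conversion copies; s itself stays immutable
	if len(b) > 0 && b[0] >= 'a' && b[0] <= 'z' {
		b[0] -= 'a' - 'A'
	}
	return s, string(b)
}

func main() {
	fmt.Println(capitalizeCopy("go")) // go Go
	fmt.Println(string(rune(65)))     // A: a code point, not the digits "65"
	fmt.Println(strconv.Itoa(65))     // 65
}
```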

Build strings efficiently

Use strings.Builder (or bytes.Buffer) to build strings without repeated allocations.

var b strings.Builder
b.WriteString("hello")
b.WriteByte('!')
result := b.String()
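
As a sketch of the pattern in a loop, the following joins words with a separator using one strings.Builder. joinWords is an illustrative name, and strings.Join already covers this exact case in real code. The Grow call is optional; it reserves capacity up front so the writes do not need to reallocate.

```go
package main

import (
	"fmt"
	"strings"
)

// joinWords concatenates words with sep between them.
// (Hypothetical helper; strings.Join does the same job.)
func joinWords(words []string, sep string) string {
	size := 0
	for _, w := range words {
		size += len(w) + len(sep)
	}

	var b strings.Builder
	b.Grow(size) // upper bound on the final length, in bytes
	for i, w := range words {
		if i > 0 {
			b.WriteString(sep)
		}
		b.WriteString(w)
	}
	return b.String()
}

func main() {
	fmt.Println(joinWords([]string{"a", "b", "c"}, ", ")) // a, b, c
}
```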

Common ops

strings.HasPrefix(s, "he") // false for "héllo": é is not e
strings.Contains(s, "é")   // true
strings.ToUpper(s)         // "HÉLLO"
strings.Split(s, ",")      // []string{"héllo"} (no separator found)

Normalization (note)

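  • Visually identical text can have different byte representations (for example, precomposed é versus e plus a combining accent); normalize with golang.org/x/text/unicode/norm before comparing if needed
  • Grapheme clusters (user-perceived characters) can span multiple runes; neither the standard library nor golang.org/x/text provides grapheme segmentation, so precise operations need a third-party package such as github.com/rivo/uniseg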
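
The first point can be seen with the standard library alone. The sketch below compares two spellings of "café" that render identically; codePoints is a helper invented for this example, and the normalization step itself (which would use the norm package) is deliberately left out.

```go
package main

import "fmt"

// codePoints lists the code points of s in U+XXXX notation.
// (Hypothetical helper, for illustration only.)
func codePoints(s string) []string {
	var out []string
	for _, r := range s {
		out = append(out, fmt.Sprintf("%U", r))
	}
	return out
}

func main() {
	composed := "caf\u00e9"    // é as a single code point
	decomposed := "cafe\u0301" // e plus a combining acute accent

	fmt.Println(composed == decomposed)         // false: the bytes differ
	fmt.Println(len(composed), len(decomposed)) // 5 6
	fmt.Println(codePoints(composed))
	fmt.Println(codePoints(decomposed))
}
```

Comparing, hashing, or using such strings as map keys treats them as different unless both are normalized to the same form first.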

Summary

  • Strings are bytes; range yields runes
  • Use []rune (or unicode/utf8) to count or slice by code points; use strings.Builder for efficient concatenation