Strings and Runes in Go
Strings are read-only byte slices; runes represent Unicode code points.
Bytes vs runes
A string is a sequence of bytes, typically UTF-8. A rune is a Unicode code point. Converting to []rune decodes runes; converting to []byte exposes raw bytes.
s := "héllo" // UTF-8 encoded
bs := []byte(s) // bytes
rs := []rune(s) // runes (Unicode code points)
_ = bs; _ = rs
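As a minimal sketch of the difference, the program below prints the raw bytes and the decoded code points of one string. The literal uses the escape \u00e9 so that the encoding of é is unambiguous, and the helper name describe is hypothetical, introduced only for illustration.

```go
package main

import "fmt"

// describe returns the bytes of s in hex and the code points of s in U+ notation.
func describe(s string) (string, string) {
	return fmt.Sprintf("% x", []byte(s)), fmt.Sprintf("%U", []rune(s))
}

func main() {
	b, r := describe("h\u00e9llo")
	fmt.Println(b) // 68 c3 a9 6c 6c 6f
	fmt.Println(r) // [U+0068 U+00E9 U+006C U+006C U+006F]
}
```

The single rune U+00E9 occupies two bytes (c3 a9), which is why the byte and rune views differ in length.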
Iteration
for range iterates runes, yielding the byte index and rune value. A classic index loop iterates bytes.
for i, r := range s { // by rune
_ = i // byte index of rune
_ = r // rune value
}
for i := 0; i < len(s); i++ { // by byte
_ = s[i]
}
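To make the byte index yielded by range concrete, here is a small sketch that collects the starting offset of every rune. The function name runeOffsets is an assumption made for this example.

```go
package main

import "fmt"

// runeOffsets returns the byte index at which each rune of s starts.
func runeOffsets(s string) []int {
	var offsets []int
	for i := range s {
		offsets = append(offsets, i)
	}
	return offsets
}

func main() {
	// é takes two bytes, so the index jumps from 1 to 3.
	fmt.Println(runeOffsets("h\u00e9llo")) // [0 1 3 4 5]
}
```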
Length
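len(s) returns the number of bytes, not characters. Use utf8.RuneCountInString(s) or len([]rune(s)) for the rune count; the former avoids allocating a slice. Even the rune count can differ from what a reader sees as characters: beware normalization and grapheme clusters (emoji with skin tones, combining accents).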
_ = len(s) // bytes
_ = len([]rune(s)) // runes
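The following sketch compares the two counts using utf8.RuneCountInString from the standard unicode/utf8 package, which counts runes without building a []rune. The helper name counts is hypothetical.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// counts returns the byte length and the rune count of s.
func counts(s string) (nBytes, nRunes int) {
	return len(s), utf8.RuneCountInString(s)
}

func main() {
	fmt.Println(counts("h\u00e9llo")) // 6 5

	// Thumbs-up followed by a skin-tone modifier: one visible emoji, two runes.
	fmt.Println(counts("\U0001F44D\U0001F3FD")) // 8 2
}
```

The second call shows why a rune count is still not a count of user-perceived characters.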
Indexing and slicing
- s[i] is the i-th byte (not rune)
- Slicing uses byte indices; cut only on UTF-8 rune boundaries, otherwise the result contains a partial (invalid) rune
// safe slice around rune boundaries
start := 0
for i := range s { start = i; break } // first rune start
_ = s[start:]
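One way to slice by runes without converting the whole string is to let range find the boundary. The sketch below takes the first n runes of a string; firstRunes is a hypothetical helper, not a standard library function.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// firstRunes returns the prefix of s containing at most n runes,
// cutting only at a rune boundary.
func firstRunes(s string, n int) string {
	if n <= 0 {
		return ""
	}
	count := 0
	for i := range s {
		if count == n {
			return s[:i] // i is the byte index just past the first n runes
		}
		count++
	}
	return s // s has n runes or fewer
}

func main() {
	s := "h\u00e9llo"
	fmt.Println(firstRunes(s, 2))                   // hé
	fmt.Println(utf8.ValidString(s[:2]))            // false: the cut splits é
	fmt.Println(utf8.ValidString(firstRunes(s, 2))) // true
}
```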
Conversions
s2 := string([]rune{'你', '好'}) // runes to string: "你好"
raw := []byte("go") // string to bytes (copies the data)
str := string(raw) // bytes back to string (copies again)
Build strings efficiently
Use strings.Builder (or bytes.Buffer) to build strings without repeated allocations.
var b strings.Builder
b.WriteString("hello")
b.WriteByte('!')
result := b.String()
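A slightly larger sketch of the same idea: building a string in a loop with one strings.Builder. The function repeatWithSep is hypothetical and exists only to illustrate the pattern; for this exact task strings.Repeat or strings.Join would also work.

```go
package main

import (
	"fmt"
	"strings"
)

// repeatWithSep joins n copies of word, separated by sep, using one Builder.
func repeatWithSep(word, sep string, n int) string {
	if n <= 0 {
		return ""
	}
	var b strings.Builder
	// Optional: reserve the final size up front to avoid regrowing the buffer.
	b.Grow(n*len(word) + (n-1)*len(sep))
	for i := 0; i < n; i++ {
		if i > 0 {
			b.WriteString(sep)
		}
		b.WriteString(word)
	}
	return b.String()
}

func main() {
	fmt.Println(repeatWithSep("go", "-", 3)) // go-go-go
}
```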
Common ops
strings.HasPrefix(s, "he")
strings.Contains(s, "é")
strings.ToUpper(s)
strings.Split(s, ",")
Normalization (note)
- Some equal-looking characters may have different byte forms; use golang.org/x/text/unicode/norm if needed
- Grapheme clusters (user-perceived characters) can span multiple runes; the standard library has no grapheme segmenter, so precise operations need a third-party package such as github.com/rivo/uniseg
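The first point can be seen with the standard library alone. This sketch compares the precomposed and decomposed spellings of é; the helper name compare is an assumption made for this example.

```go
package main

import "fmt"

// compare reports whether a and b are byte-for-byte equal (what == checks)
// and returns the byte length of each.
func compare(a, b string) (equal bool, lenA, lenB int) {
	return a == b, len(a), len(b)
}

func main() {
	composed := "\u00e9"    // é as a single code point
	decomposed := "e\u0301" // e followed by a combining acute accent
	fmt.Println(compare(composed, decomposed)) // false 2 3
}
```

Both strings render as é, yet they compare unequal. Normalizing both sides to one form first, for instance with norm.NFC.String from golang.org/x/text/unicode/norm, is the usual way to make such comparisons behave as expected.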
Summary
- Strings are bytes; range yields runes
- Use []rune to count/substring by characters; strings.Builder for efficient concatenation