String Operations in Python
Python strings (str) are immutable sequences of Unicode code points, with a rich set of methods and several formatting options.
Creating strings and basics
Use single, double, or triple quotes. Triple quotes allow a literal to span several lines, and raw strings (prefix r) leave backslashes unprocessed.
s = "hello\nworld" # \n is processed as a newline
raw = r"path\to\file" # backslashes are kept literally
multiline = """Line 1
Line 2"""
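As a quick check of the quoting rules above, the following sketch compares an escaped literal with its raw counterpart and confirms that item assignment on a string is rejected. The variable names are illustrative only.

```python
# Escapes are processed in normal literals but not in raw literals.
escaped = "a\nb"   # 3 characters: 'a', newline, 'b'
raw = r"a\nb"      # 4 characters: 'a', backslash, 'n', 'b'
lengths = (len(escaped), len(raw))

# A triple-quoted literal keeps the embedded newline.
multiline = """Line 1
Line 2"""
lines = multiline.splitlines()

# Strings are immutable: item assignment raises TypeError.
try:
    escaped[0] = "A"
    mutated = True
except TypeError:
    mutated = False

print(lengths, lines, mutated)
```

To "change" a string, build a new one, for example with slicing or replace.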
Common methods
" hi ".strip() # "hi"
"a,b,c".split(",") # ["a", "b", "c"]
"test".replace("t", "T") # "TesT"
"go" * 3 # "gogogo"
"python".startswith("py") # True
"py".upper(), "PY".lower(), "Py".swapcase() # ("PY", "py", "pY")
Formatting
Prefer f‑strings; use format specifiers for alignment, precision, and number formatting.
name, n = "Ada", 42
f"{name} = {n}" # f‑string
"{} = {}".format(name, n) # str.format
"%s = %d" % (name, n) # legacy
pi = 3.14159
f"pi ≈ {pi:.2f}" # "pi ≈ 3.14"
f"{n:04d}" # zero‑pad to width 4: "0042"
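The format specifiers mentioned above cover more than precision and zero-padding. The sketch below shows a few common ones (alignment, thousands separators, percentages, hex); the values are made up for illustration.

```python
name, n = "Ada", 42

left = f"{name:<6}|"         # left-align in a field of width 6
right = f"{name:>6}|"        # right-align in a field of width 6
center = f"{name:^7}|"       # center in a field of width 7
thousands = f"{1234567:,}"   # comma as thousands separator
percent = f"{0.256:.1%}"     # multiply by 100, one decimal, add %
hexed = f"{n:#x}"            # hexadecimal with 0x prefix
debug = f"{n=}"              # name and value (Python 3.8+)

for value in (left, right, center, thousands, percent, hexed, debug):
    print(value)
```

The same specifiers work with str.format and the built-in format(), since all three share one format specification mini-language.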
Joining and building
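Join an iterable of strings with a separator. For many concatenations, collect the parts in a list and call "".join, or write to an io.StringIO buffer, rather than repeating += in a loop, which can copy the growing string each time.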
"-".join(["a", "b"]) # "a-b"
from io import StringIO
buf = StringIO()
buf.write("a")
buf.write("b")
built = buf.getvalue() # "ab"
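The two building strategies named above can be sketched side by side. The function names and the rows data are hypothetical, chosen only for this example; both functions are meant to produce the same text.

```python
from io import StringIO

def build_with_join(rows):
    """Collect the pieces in a list, then join once at the end."""
    parts = []
    for label, value in rows:
        parts.append(f"{label}: {value}")
    return "\n".join(parts)

def build_with_buffer(rows):
    """Write the pieces into an in-memory text buffer."""
    buf = StringIO()
    for i, (label, value) in enumerate(rows):
        if i:
            buf.write("\n")  # separator before every row except the first
        buf.write(f"{label}: {value}")
    return buf.getvalue()

rows = [("a", 1), ("b", 2)]
report = build_with_join(rows)
same = build_with_buffer(rows) == report
print(report)
```

The list-plus-join form is usually the simpler choice; a buffer is handy when code already expects a file-like object.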
Searching
s = "hello\nworld"
s.find("lo") # 3; returns -1 if not found
s.index("lo") # 3; raises ValueError if not found
s.count("l") # 3 non-overlapping occurrences
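Because find signals a miss with -1 and index signals it with an exception, the choice affects how the caller handles the missing case. A minimal sketch follows; locate is a hypothetical helper, not a standard function.

```python
def locate(haystack, needle):
    """Return the first index of needle, or None when it is absent."""
    pos = haystack.find(needle)
    return pos if pos != -1 else None

text = "hello\nworld"
found = locate(text, "lo")
missing = locate(text, "xyz")

# index raises instead of returning a sentinel value.
try:
    text.index("xyz")
    raised = False
except ValueError:
    raised = True

present = "wor" in text      # membership test when the position is not needed
last_l = text.rfind("l")     # search from the right

print(found, missing, raised, present, last_l)
```

Checking for -1 explicitly matters because -1 is also a valid index (the last character), so using a failed find result directly in a slice or subscript can hide a bug.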
Encoding and bytes
Encode to bytes and decode back; specify encodings explicitly.
b = "café".encode("utf-8")
text = b.decode("utf-8")
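To illustrate why the encoding should be stated explicitly, the sketch below compares character and byte lengths and shows what happens when the codec cannot represent a character or when the wrong codec is used for decoding. The literal uses an escape for é so the example does not depend on how this file is saved.

```python
text = "caf\u00e9"                  # "café" with a precomposed é
data = text.encode("utf-8")

sizes = (len(text), len(data))      # code points vs. bytes
round_trip = data.decode("utf-8")

# ASCII cannot represent é, so strict encoding raises.
try:
    text.encode("ascii")
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False

# The errors argument selects a fallback instead of raising.
replaced = text.encode("ascii", errors="replace")

# Decoding with the wrong codec succeeds but yields mojibake.
wrong = data.decode("latin-1")

print(sizes, round_trip, ascii_ok, replaced, wrong)
```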
Normalization and graphemes
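Strings that look identical may consist of different code point sequences, in which case they compare unequal. Use unicodedata.normalize to convert both to the same canonical form (such as NFC) before comparing. A user‑perceived character (grapheme cluster) can span multiple code points, so len() may not match what a reader sees.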
import unicodedata as ud
ud.normalize("NFC", "e\u0301") # "é"
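The effect of normalization on comparison can be sketched as follows. The example only counts code points; segmenting text into grapheme clusters follows separate Unicode rules and is not attempted here.

```python
import unicodedata as ud

composed = "\u00e9"       # é as one code point
decomposed = "e\u0301"    # e followed by a combining acute accent

equal_raw = composed == decomposed
equal_nfc = ud.normalize("NFC", composed) == ud.normalize("NFC", decomposed)
lengths = (len(composed), len(decomposed))

# NFD goes the other way and splits the precomposed form.
nfd = ud.normalize("NFD", composed)

# One user-perceived character built from two code points
# (a pair of regional indicator symbols that renders as a flag).
flag = "\U0001F1EF\U0001F1F5"
flag_len = len(flag)

print(equal_raw, equal_nfc, lengths, flag_len)
```

Normalizing at input boundaries (for example, when reading user input or file names) keeps later comparisons and dictionary lookups consistent.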
Regex (preview)
Use the re module for complex searching and substitution; see the Regular Expressions section for details.
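As a small taste of what re adds over the plain string methods, the sketch below extracts structured fields and masks digits. The log line and patterns are invented for illustration.

```python
import re

log = "user=ada id=42; user=bob id=7"

# findall returns one tuple per match when the pattern has several groups.
pairs = re.findall(r"user=(\w+) id=(\d+)", log)

# sub replaces every match of the pattern.
masked = re.sub(r"\d+", "#", log)

print(pairs)
print(masked)
```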
Summary
- Strings are immutable; use methods that return new strings
- Prefer f‑strings; join efficiently; handle encodings and normalization explicitly