Week 4 — Standard Library
Python ships with "batteries included": the standard library covers most everyday tasks without third-party packages. This week covers the most useful modules: datetime for dates and times, pathlib for files, collections for specialized containers, and itertools for efficient looping.
What you'll learn
- Parse and format dates with datetime
- Navigate the filesystem with pathlib.Path
- Use Counter, defaultdict, and deque from collections
- Apply itertools.chain, groupby, combinations, permutations
- Generate random data with the random module
1. datetime
from datetime import datetime, timedelta, date
now = datetime.now()
print(now.strftime("%Y-%m-%d %H:%M:%S"))
bday = date(1990, 6, 15)
today = date.today()
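# Subtract 1 if this year's birthday hasn't happened yet (days // 365 drifts with leap years)
age = today.year - bday.year - ((today.month, today.day) < (bday.month, bday.day))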
print(f"Age: {age} years")
next_week = now + timedelta(days=7)
print(next_week.date())
2. pathlib
from pathlib import Path
p = Path(".")
for f in p.glob("*.py"):
    print(f.name, f.stat().st_size, "bytes")
data = Path("data.txt")
data.write_text("Hello, pathlib!")
print(data.read_text())
data.unlink()  # delete
3. collections
from collections import Counter, defaultdict, deque
# Counter: count occurrences
words = "the cat sat on the mat".split()
print(Counter(words).most_common(3))
# defaultdict: no KeyError on missing key
dd = defaultdict(list)
dd["fruits"].append("apple")
# deque: efficient append/pop at both ends
q = deque([1, 2, 3])
q.appendleft(0)
q.append(4)
print(list(q))  # [0, 1, 2, 3, 4]
4. itertools
from itertools import chain, combinations, permutations, groupby
print(list(chain([1,2], [3,4], [5])))
print(list(combinations("ABC", 2)))
print(list(permutations([1,2,3], 2)))
data = [("a",1),("a",2),("b",3),("b",4)]
for key, group in groupby(data, key=lambda x: x[0]):
    print(key, list(group))
When to reach for what (and the gotchas)
These modules overlap with things you could hand-roll, but the standard library versions are already tested, usually faster, and clearer to read. Here is a quick decision table with the trap each one hides.
| Need | Reach for | Common gotcha |
|---|---|---|
| Calendar dates / deadlines | datetime.date | Subtracting dates gives a timedelta — read .days, don't compare strings |
| Date + time of day | datetime.datetime | Naive vs timezone-aware: don't mix them in subtraction |
| Filesystem paths | pathlib.Path | Path objects, not strings — wrap with str(p) only at the boundary |
| Counting occurrences | collections.Counter | Ties keep insertion order, not alphabetical |
| Missing-key defaults | collections.defaultdict | Reading a missing key CREATES it |
| Fast both-end queue | collections.deque | Random indexing (q[5000]) is O(n), unlike a list |
| Grouping rows | itertools.groupby | Only groups CONSECUTIVE keys — sort first |
Rule of thumb: if you're about to write a loop that counts, buckets, or pairs things up, there's usually a one-liner in collections or itertools that does it faster and reads better.
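As a rough illustration of that rule of thumb, the sketch below counts, buckets, and pairs a small list of names. The sample data and variable names are made up for this example and are not part of the lesson.

```python
from collections import Counter, defaultdict
from itertools import combinations

# Hypothetical sample data, invented for this sketch.
names = ["ann", "bob", "amy", "ben", "ann"]

# Counting: Counter replaces a dict plus an "if key in d" loop.
counts = Counter(names)

# Bucketing: defaultdict(list) groups names by their first letter.
by_letter = defaultdict(list)
for name in names:
    by_letter[name[0]].append(name)

# Pairing: combinations yields each unordered pair exactly once.
pairs = list(combinations(sorted(set(names)), 2))

print(counts.most_common(1))  # [('ann', 2)]
print(dict(by_letter))        # {'a': ['ann', 'amy', 'ann'], 'b': ['bob', 'ben']}
print(len(pairs))             # 6
```

Each of the three steps would otherwise need its own loop with manual bookkeeping.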
Common Mistakes (FAQ)
Q. datetime or date — which should I use?
Use date for calendar days (birthdays, deadlines, 'days until'); use datetime when the time of day matters. Subtracting either gives a timedelta — read .days or .total_seconds(), never compare formatted strings.
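A minimal sketch of the point above, using fixed, made-up dates so the output is reproducible. It shows that both subtractions give a timedelta, and why comparing formatted strings can go wrong (here with a day-first format, chosen as an assumption for the demonstration).

```python
from datetime import date, datetime

# Illustrative values only; any dates would do.
start = date(2025, 2, 10)
deadline = date(2025, 3, 1)

gap = deadline - start                 # date - date -> timedelta
print(gap.days)                        # 19

opened = datetime(2025, 2, 1, 9, 0)
closed = datetime(2025, 2, 1, 17, 30)
shift = closed - opened                # datetime - datetime -> timedelta
print(shift.total_seconds() / 3600)    # 8.5

# Comparing the objects is correct.
print(start < deadline)                # True

# Comparing day-first strings is not: "10/02/2025" sorts after "01/03/2025".
fmt = "%d/%m/%Y"
print(start.strftime(fmt) < deadline.strftime(fmt))  # False
```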
Q. defaultdict created a key I only meant to read — why?
Indexing a missing key on a defaultdict runs the factory and INSERTS it. If you only want to check, use d.get(key) or `key in d` instead of d[key].
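The difference is easy to see in a short sketch. The `stock` dictionary and its keys are hypothetical names used only for illustration.

```python
from collections import defaultdict

stock = defaultdict(int)
stock["apples"] += 3

# Safe reads: neither of these inserts a key.
print("pears" in stock)     # False
print(stock.get("pears"))   # None
print(len(stock))           # 1

# Indexing runs the factory (int() returns 0) and stores the result.
print(stock["pears"])       # 0
print(len(stock))           # 2
```

This matters most when you later iterate over the dictionary or check its length, since the accidental keys show up there.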
Q. itertools.groupby returned weird/empty groups.
groupby only groups CONSECUTIVE items with the same key. Sort the data by that same key first (sorted(data, key=...)), otherwise identical keys scattered through the list become separate groups.
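The sketch below runs groupby on the same rows twice, once as given and once sorted, to show the effect. The rows are made-up sample data. Each group is materialized into a list inside the comprehension, because a group iterator is only valid until groupby advances to the next key.

```python
from itertools import groupby
from operator import itemgetter

# Sample rows with interleaved keys (illustrative data).
rows = [("a", 1), ("b", 2), ("a", 3), ("b", 4)]
by_key = itemgetter(0)

# Without sorting, every change of key starts a new group.
scattered = [key for key, _ in groupby(rows, key=by_key)]
print(scattered)  # ['a', 'b', 'a', 'b']

# Sorting by the same key first makes equal keys adjacent.
grouped = {
    key: [value for _, value in group]
    for key, group in groupby(sorted(rows, key=by_key), key=by_key)
}
print(grouped)  # {'a': [1, 3], 'b': [2, 4]}
```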
Q. Counter.most_common() ties aren't alphabetical — is that a bug?
No. Equal counts keep their first-seen (insertion) order in Python 3.7+. If you need a deterministic tie-break, sort with a secondary key: sorted(c.items(), key=lambda kv: (-kv[1], kv[0])).
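Here is a small sketch of that tie-break on a made-up word list, comparing the default order with the sorted one.

```python
from collections import Counter

# "pear" and "apple" both appear twice; "pear" is seen first.
c = Counter("pear apple fig apple pear".split())

# Default: ties stay in first-seen order.
print(c.most_common())
# [('pear', 2), ('apple', 2), ('fig', 1)]

# Deterministic: highest count first, then alphabetical.
ranked = sorted(c.items(), key=lambda kv: (-kv[1], kv[0]))
print(ranked)
# [('apple', 2), ('pear', 2), ('fig', 1)]
```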
Q. pathlib or os.path?
Prefer pathlib — it's the modern, object-oriented API (p / 'sub' / 'file.txt', p.read_text(), p.glob(...)). Reach for os.path only when an older library hands you string paths.
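A brief sketch of the two styles side by side. The path `project/data/report.txt` is hypothetical and nothing is read from or written to disk.

```python
import os
from pathlib import Path

# Build a path with the / operator (no file access happens here).
report = Path("project") / "data" / "report.txt"

print(report.name)         # report.txt
print(report.suffix)       # .txt
print(report.parent.name)  # data

# The os.path equivalent works on plain strings.
legacy = os.path.join("project", "data", "report.txt")

# Converting at the boundary works in both directions.
print(Path(legacy) == report)       # True
print(os.fspath(report) == legacy)  # True
```

Most standard library functions that take a filename, such as open(), accept a Path directly, so the string conversion is only needed for APIs that insist on str.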
💻 Examples
Run these examples and check the output yourself.
from pathlib import Path
from collections import Counter
path = Path(".").resolve()
all_files = [f for f in path.rglob("*") if f.is_file()]
exts = Counter(f.suffix or "(no ext)" for f in all_files)
total_size = sum(f.stat().st_size for f in all_files)
print(f"Total files: {len(all_files)}")
print(f"Total size: {total_size/1024:.1f} KB")
print("\nTop file types:")
for ext, cnt in exts.most_common(5):
    print(f" {ext:<12} {cnt}")
from datetime import date, timedelta
today = date.today()
print(f"Today: {today}")
print(f"100 days later: {today + timedelta(days=100)}")
# Days until New Year
new_year = date(today.year + 1, 1, 1)
print(f"Days to New Year: {(new_year - today).days}")
📝 Exercises
Try them yourself first, then open the solution to compare.
Word Frequency Analyzer
Goal: Read a text file and produce a frequency report using Counter.
- Read file with pathlib
- Tokenize into words (lowercase, strip punctuation)
- Print top-10 words with counts and a bar chart (ASCII)
Solution
from pathlib import Path
from collections import Counter
import re
text = Path("sample.txt").read_text(errors="ignore").lower()
words = re.findall(r"[a-z]+", text)
top = Counter(words).most_common(10)
max_c = top[0][1]
for word, count in top:
    bar = "█" * (count * 20 // max_c)
    print(f"{word:<15} {count:5d} {bar}")
All lecture materials and example code are openly available on GitHub.