🚀

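Advanced

Types + Async + Third-Party Libraries + Testing + Deployment

Week 10: Advanced Capstone Practice

A hands-on project that brings together async code, third-party libraries, testing, and deployment. It also covers package structure and setting up CI.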

capstone · async · pytest · packaging

Time required

⏱ Project-based

Difficulty

📊 Advanced

Prerequisites

🎯 All of Advanced weeks 1-9

Deliverable

A complete, integrated package project

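What you will learn in this lesson

1. Complete a hands-on project that integrates async code, third-party libraries, testing, and deployment
2. Apply a package structure and CI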

1. Project structure (recommended)

text

my_project/
├── pyproject.toml
├── README.md
├── src/my_project/
│   ├── __init__.py
│   ├── models.py        (dataclass)
│   ├── client.py        (httpx async)
│   ├── analyzer.py      (Pandas)
│   └── cli.py           (argparse entry point)
├── tests/
│   ├── test_models.py
│   ├── test_client.py
│   └── test_analyzer.py
└── .github/workflows/ci.yml   (optional)

2. Skeleton: async crawler + analysis

models.py defines the domain with a dataclass → client.py fetches asynchronously with httpx → analyzer.py aggregates with Pandas → cli.py provides the argparse entry point.
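
The flow above can be sketched end to end in a single file before it is split into modules. This is only a minimal sketch with standard-library stand-ins: `fake_fetch` takes the place of the httpx call and `statistics.mean` takes the place of the Pandas aggregation. The function names and the scoring rule are hypothetical, chosen so the example runs without network access.

```python
import argparse
import asyncio
from dataclasses import dataclass
from statistics import mean
from typing import Optional


@dataclass
class Article:  # models.py: domain object
    title: str
    url: str
    score: int = 0


async def fake_fetch(url: str) -> Article:
    # client.py stand-in (hypothetical): a real version would await an httpx request
    await asyncio.sleep(0)
    return Article(title=f"page {url}", url=url, score=len(url))


async def crawl(urls: list[str]) -> list[Article]:
    # Run all fetches concurrently; gather preserves the input order
    return list(await asyncio.gather(*(fake_fetch(u) for u in urls)))


def average_score(articles: list[Article]) -> float:
    # analyzer.py stand-in: a real version might use a Pandas groupby
    return float(mean(a.score for a in articles))


def main(argv: Optional[list[str]] = None) -> float:
    # cli.py: parse arguments, then wire the stages together
    parser = argparse.ArgumentParser(prog="my_project")
    parser.add_argument("urls", nargs="+")
    args = parser.parse_args(argv)
    result = average_score(asyncio.run(crawl(args.urls)))
    print(f"average score: {result}")
    return result
```

Keeping each stage as a plain function with typed inputs and outputs is what makes the later split into models.py, client.py, analyzer.py, and cli.py mostly a matter of moving code.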

3. Checklist

Type hints + mypy --strict
pytest tests (70%+ coverage)
CLI (argparse or click)
Async or parallel processing
Third-party libraries (requests/httpx, pandas, etc.)
Packaging with pyproject.toml
README (installation and usage)
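
Two of the checklist items, type hints and pytest tests, can be illustrated together. The sketch below assumes a hypothetical `top_tags` helper (it is not part of the course code) and shows the shape a file such as tests/test_models.py might take: fully annotated functions plus `test_*` functions that use plain `assert`.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Article:
    title: str
    url: str
    score: int = 0
    tags: list[str] = field(default_factory=list)


def top_tags(articles: list[Article], n: int = 3) -> list[tuple[str, int]]:
    """Return the n most common tags with their counts (hypothetical helper)."""
    counts = Counter(tag for a in articles for tag in a.tags)
    return counts.most_common(n)


# pytest discovers functions whose names start with test_
def test_default_tags_are_not_shared() -> None:
    a, b = Article("a", "https://a"), Article("b", "https://b")
    a.tags.append("python")
    assert b.tags == []  # default_factory gives each instance its own list


def test_top_tags_counts() -> None:
    articles = [
        Article("a", "https://a", tags=["python", "async"]),
        Article("b", "https://b", tags=["python"]),
    ]
    assert top_tags(articles, n=1) == [("python", 2)]
```

In a real project the tests would live under tests/ and import from the package instead of redefining `Article`. Coverage is commonly measured with the pytest-cov plugin.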

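Common mistakes

One monolithic file: hard to test and hard to reuse
Missing type hints: no help from the IDE
Starting without tests: adding them later is harder
No README: even you will forget how it works within a few days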

FAQ

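Q1. What is CI? Continuous Integration: tests run automatically on every push (for example, with GitHub Actions).

Q2. Which domain should I pick? Something you would actually use: a news summarizer, a study-time tracker, a price alert, and so on.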

Final assignment: choose 1 of 4

Async crawler + analysis
CLI tool package (published to PyPI)
Data pipeline (CSV → cleaning → visualization)
Automation bot (Slack/Discord webhook)

Next steps

You have completed the Advanced course 🎉. From here, continue by domain:

Backend: FastAPI / Django REST framework
Data & ML: PyTorch / scikit-learn / Hugging Face
Automation & DevOps: Ansible / Airflow / Docker
Web scraping: Scrapy / Playwright

💻 Examples

Example code that you can run yourself to check the results.

models.py: dataclass domain model

CODE

from dataclasses import dataclass, field

@dataclass
class Article:
    title: str
    url: str
    score: int = 0
    # default_factory gives each instance its own list
    tags: list[str] = field(default_factory=list)

a = Article("Async Python", "https://x", score=120, tags=["python", "async"])
print(a)

▶ Output

Article(title='Async Python', url='https://x', score=120, tags=['python', 'async'])
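
Because analyzer.py works on tabular data, the dataclass instances have to become rows at some point. One option, sketched here, is `dataclasses.asdict`. The `to_rows` name is hypothetical, and handing the resulting list of dicts to `pd.DataFrame` is assumed rather than shown.

```python
from dataclasses import asdict, dataclass, field


@dataclass
class Article:
    title: str
    url: str
    score: int = 0
    tags: list[str] = field(default_factory=list)


def to_rows(articles: list[Article]) -> list[dict]:
    # asdict() recursively copies the fields into plain dicts
    return [asdict(a) for a in articles]


rows = to_rows([Article("Async Python", "https://x", score=120, tags=["python"])])
print(rows)
# [{'title': 'Async Python', 'url': 'https://x', 'score': 120, 'tags': ['python']}]
```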

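client.py: async fetch with httpx

CODE

import asyncio
import httpx

async def fetch_one(client: httpx.AsyncClient, url: str) -> tuple[int, int]:
    # Return the status code and the length of the decoded body in characters
    r = await client.get(url, timeout=10)
    return r.status_code, len(r.text)

async def fetch_all(urls: list[str]) -> list[tuple[int, int]]:
    # Share one client (and its connection pool) across all requests
    async with httpx.AsyncClient() as client:
        return await asyncio.gather(*(fetch_one(client, u) for u in urls))

if __name__ == "__main__":
    urls = ["https://example.com"] * 3
    print(asyncio.run(fetch_all(urls)))

▶ Output

[(200, 1256), (200, 1256), (200, 1256)]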
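
`fetch_all` starts every request at once, and `asyncio.gather` propagates the first exception it sees. Two common refinements are capping concurrency with `asyncio.Semaphore` and collecting failures with `return_exceptions=True`. The sketch below assumes a hypothetical `fake_get` in place of `client.get` so that it runs without network access; the limit of 2 is arbitrary.

```python
import asyncio


async def fake_get(url: str) -> int:
    # Hypothetical stand-in for an HTTP call; URLs containing "bad" simulate a failure
    await asyncio.sleep(0.01)
    if "bad" in url:
        raise ConnectionError(f"cannot reach {url}")
    return 200


async def fetch_limited(urls: list[str], limit: int = 2) -> list[object]:
    sem = asyncio.Semaphore(limit)  # at most `limit` requests in flight

    async def one(url: str) -> int:
        async with sem:
            return await fake_get(url)

    # return_exceptions=True places exceptions in the result list instead of
    # raising the first one, so a single failure does not discard the rest
    return await asyncio.gather(*(one(u) for u in urls), return_exceptions=True)


results = asyncio.run(fetch_limited(["https://a", "https://bad", "https://c"]))
for r in results:
    print(r if isinstance(r, int) else f"ERROR {r}")
```

Results come back in the same order as the input URLs, so each entry can be matched to its URL whether it is a status code or an exception.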

analyzer.py: aggregation with Pandas

CODE

import pandas as pd

def by_dept(df: pd.DataFrame) -> pd.Series:
    return df.groupby("dept")["score"].mean()

if __name__ == "__main__":
    df = pd.DataFrame({"name": ["A", "B", "C", "D"], "score": [90, 70, 80, 60], "dept": ["X", "Y", "X", "Y"]})
    print(by_dept(df))

▶ Output

dept
X    85.0
Y    65.0
Name: score, dtype: float64

cli.py: argparse entry point

CODE

import argparse

def main() -> None:
    p = argparse.ArgumentParser(prog="myapp")
    # required=True makes argparse reject a call with no subcommand
    sub = p.add_subparsers(dest="cmd", required=True)
    sub.add_parser("crawl")
    a = sub.add_parser("analyze")
    a.add_argument("path")
    args = p.parse_args()

    if args.cmd == "crawl":
        print("Starting crawl")
    elif args.cmd == "analyze":
        print(f"Analyzing: {args.path}")

if __name__ == "__main__":
    main()

▶ Output

$ myapp crawl
Starting crawl
$ myapp analyze data.csv
Analyzing: data.csv
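
A `main()` that reads `sys.argv` directly and prints its result is awkward to call from pytest. One possible variation, sketched here with the hypothetical names `build_parser` and `run`, accepts an optional argument list and returns the message, so a test can call it like any other function.

```python
import argparse
from typing import Optional


def build_parser() -> argparse.ArgumentParser:
    p = argparse.ArgumentParser(prog="myapp")
    sub = p.add_subparsers(dest="cmd", required=True)
    sub.add_parser("crawl")
    a = sub.add_parser("analyze")
    a.add_argument("path")
    return p


def run(argv: Optional[list[str]] = None) -> str:
    # parse_args(None) falls back to sys.argv[1:], so the real CLI still works
    args = build_parser().parse_args(argv)
    if args.cmd == "crawl":
        return "Starting crawl"
    return f"Analyzing: {args.path}"


def test_analyze_subcommand() -> None:
    assert run(["analyze", "data.csv"]) == "Analyzing: data.csv"
```

The console entry point would then be a thin wrapper that prints the value returned by `run()`.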

📝 Exercises

Try each one yourself first, and compare with the solution when you get stuck.

Exercise 1

Final 1: async crawler

Goal: fetch several URLs concurrently and print the size and response time of each.

Requirements

httpx + asyncio.gather
Print (url, status, ms, bytes) for each URL
10-second timeout

💡 Hints

time.perf_counter()

asyncio.gather

Sample input/output

https://example.com 200 305ms 1256B
https://httpbin.org/get 200 412ms 380B

Grading

· Requests run concurrently
· Errors are handled

SOLUTION

import asyncio
import time
import httpx

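async def fetch(client: httpx.AsyncClient, url: str) -> str:
    start = time.perf_counter()
    try:
        r = await client.get(url, timeout=10)
        ms = int((time.perf_counter() - start) * 1000)
        # len(r.content) counts raw bytes; len(r.text) would count decoded characters
        return f"{url} {r.status_code} {ms}ms {len(r.content)}B"
    except httpx.HTTPError as e:
        # Covers timeouts, connection failures, and other request errors
        return f"{url} ERROR {e}"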

async def main(urls):
    async with httpx.AsyncClient() as client:
        results = await asyncio.gather(*(fetch(client, u) for u in urls))
    for line in results:
        print(line)

asyncio.run(main(["https://example.com", "https://httpbin.org/get"]))

▶ Output

https://example.com 200 305ms 1256B
https://httpbin.org/get 200 412ms 380B

Exercise 2

Final 2: CLI package

Goal: build an argparse-based CLI package and install it locally.

Requirements

src/myapp/cli.py + pyproject.toml
myapp greet --name=홍길동 → prints a greeting

💡 Hint

[project.scripts] myapp = 'myapp.cli:main'

Sample input/output

Hello, 홍길동!

Grading

· pip install -e . works
· CLI output matches

SOLUTION

# pyproject.toml
[project]
name = "myapp"
version = "0.1.0"
[project.scripts]
# Installs a `myapp` command that calls main() in myapp/cli.py
myapp = "myapp.cli:main"
[build-system]
requires = ["setuptools>=61"]
build-backend = "setuptools.build_meta"

# src/myapp/cli.py
import argparse

def main() -> None:
    p = argparse.ArgumentParser(prog="myapp")
    sub = p.add_subparsers(dest="cmd", required=True)
    g = sub.add_parser("greet")
    g.add_argument("--name", required=True)
    args = p.parse_args()
    if args.cmd == "greet":
        print(f"Hello, {args.name}!")

# pip install -e .
# myapp greet --name=홍길동

▶ Output

Hello, 홍길동!

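Exercise 3

Final 3: data pipeline

Goal: build a CSV → filter → aggregate → print flow.

Requirements

Load the data with pandas read_csv
Keep only rows with score >= 80
Print the mean score per dept

💡 Hint

Use StringIO for an inline CSV

Sample input/output

dept
A    88.5
Name: score, dtype: float64

Grading

· Filtering
· groupby + mean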

SOLUTION

import pandas as pd
from io import StringIO

csv = """name,score,dept
Alice,92,A
Bob,78,B
Charlie,85,A
Dave,67,B"""

df = pd.read_csv(StringIO(csv))
high = df[df.score >= 80]
print(high.groupby("dept")["score"].mean())

▶ Output

dept
A    88.5
Name: score, dtype: float64

Example code / lecture materials

The full lecture materials and example code are freely available on GitHub.