Runnable, tested Python web scraping examples — the companion code for python-web-scraping.com.
Every example is a small, self-contained function with a test. Each module maps to a guide on the site, so you can read the explanation there and run the code here.
git clone https://github.com/python-web-scraping-com/examples.git
cd examples
python -m venv .venv && source .venv/bin/activate # Windows: .venv\Scripts\activate
pip install -e ".[dev]"

Run any example directly:
python -m pws_examples.parsing
python -m pws_examples.storage
python -m pws_examples.async_scrape # this one makes real HTTP requests

Run the tests and linter:
pytest
ruff check .

| Module | What it shows | Guide |
|---|---|---|
| pws_examples/parsing.py | Parse product cards and HTML tables with BeautifulSoup | Parsing HTML with BeautifulSoup |
| pws_examples/extraction.py | Extract emails and phone numbers with regex | Extracting Data with Regular Expressions |
| pws_examples/http_client.py | A requests session with retries and backoff | Understanding HTTP Requests and Responses |
| pws_examples/storage.py | Validate with Pydantic, store in SQLite, de-duplicate | Storing and Exporting Scraped Data |
| pws_examples/async_scrape.py | Concurrent fetching with asyncio + HTTPX and a semaphore | Asynchronous Scraping with asyncio and HTTPX |
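
The bounded-concurrency pattern named in the last row can be sketched with the standard library alone. This is not the code in pws_examples/async_scrape.py: `fake_fetch` is a hypothetical stand-in for an HTTPX request so the sketch runs offline, and the limit of 3 and the example.com URLs are assumptions chosen for illustration.

```python
import asyncio

MAX_CONCURRENCY = 3  # assumed limit, chosen only for this sketch


async def fake_fetch(url: str, stats: dict) -> str:
    """Hypothetical stand-in for a real HTTP call; records how many run at once."""
    stats["active"] += 1
    stats["peak"] = max(stats["peak"], stats["active"])
    await asyncio.sleep(0.01)  # simulate network latency
    stats["active"] -= 1
    return f"body of {url}"


async def fetch_all(urls: list[str], limit: int = MAX_CONCURRENCY) -> tuple[list[str], int]:
    """Fetch every URL, allowing at most `limit` fetches in flight at a time."""
    semaphore = asyncio.Semaphore(limit)
    stats = {"active": 0, "peak": 0}

    async def bounded(url: str) -> str:
        # The semaphore blocks here once `limit` fetches are already running.
        async with semaphore:
            return await fake_fetch(url, stats)

    # gather() returns results in the same order as the input URLs.
    results = await asyncio.gather(*(bounded(u) for u in urls))
    return list(results), stats["peak"]


if __name__ == "__main__":
    demo_urls = [f"https://example.com/page/{i}" for i in range(10)]
    bodies, peak = asyncio.run(fetch_all(demo_urls))
    print(f"fetched {len(bodies)} pages, peak concurrency {peak}")
```

Swapping `fake_fetch` for a real client call keeps the same shape: the semaphore caps in-flight requests regardless of how many URLs are queued.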
The tests use local HTML fixtures and httpx.MockTransport, so the suite is
deterministic and runs offline — no live websites are hit in CI.
These examples are for learning. When scraping real sites, respect robots.txt,
rate-limit your requests, identify your client honestly, and follow each site's
terms of service.
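
As a minimal sketch of the first two points, the standard library's `urllib.robotparser` can check whether a URL is allowed and read a site's crawl delay. The user agent name, the sample robots.txt content, and the URLs below are hypothetical; a real scraper would load the site's actual robots.txt instead of an inline string.

```python
from urllib.robotparser import RobotFileParser

USER_AGENT = "example-learning-bot"  # hypothetical client name

# Assumed robots.txt content, inlined so the sketch runs offline.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Crawl-delay: 2
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text that has already been downloaded."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def allowed_urls(parser: RobotFileParser, urls: list[str]) -> list[str]:
    """Keep only the URLs that robots.txt permits for our user agent."""
    return [url for url in urls if parser.can_fetch(USER_AGENT, url)]


def polite_delay(parser: RobotFileParser, default: float = 1.0) -> float:
    """Seconds to wait between requests: the site's Crawl-delay, else a default."""
    delay = parser.crawl_delay(USER_AGENT)
    return float(delay) if delay is not None else default


if __name__ == "__main__":
    parser = build_parser(SAMPLE_ROBOTS_TXT)
    candidates = [
        "https://example.com/products",
        "https://example.com/private/admin",
    ]
    print(allowed_urls(parser, candidates))
    print(polite_delay(parser))
```

A scraper would then sleep for `polite_delay(parser)` seconds between requests to the same host.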
Contributions are welcome. Please keep each example small and focused, add a test
for it, and make sure `pytest` and `ruff check .` pass before opening a PR.
MIT © Python Web Scraping