Python Web Scraping — Examples

Runnable, tested Python web scraping examples — the companion code for python-web-scraping.com.

Every example is a small, self-contained function with a test. Each module maps to a guide on the site, so you can read the explanation there and run the code here.

Quick start

git clone https://github.com/python-web-scraping-com/examples.git
cd examples
python -m venv .venv && source .venv/bin/activate   # Windows: .venv\Scripts\activate
pip install -e ".[dev]"

Run any example directly:

python -m pws_examples.parsing
python -m pws_examples.storage
python -m pws_examples.async_scrape   # this one makes real HTTP requests

Run the tests and linter:

pytest
ruff check .

What's inside

Module	What it shows	Guide
`pws_examples/parsing.py`	Parse product cards and HTML tables with BeautifulSoup	Parsing HTML with BeautifulSoup
`pws_examples/extraction.py`	Extract emails and phone numbers with regex	Extracting Data with Regular Expressions
`pws_examples/http_client.py`	A `requests` session with retries and backoff	Understanding HTTP Requests and Responses
`pws_examples/storage.py`	Validate with Pydantic, store in SQLite, de-duplicate	Storing and Exporting Scraped Data
`pws_examples/async_scrape.py`	Concurrent fetching with asyncio + HTTPX and a semaphore	Asynchronous Scraping with asyncio and HTTPX
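
The semaphore pattern named in the last row can be sketched with the standard library alone. This is not the code in `pws_examples/async_scrape.py`: `fake_fetch` is a hypothetical stand-in for an HTTPX call, and `fetch_all` and `max_concurrency` are names assumed for this illustration.

```python
import asyncio


async def fake_fetch(url: str) -> str:
    """Hypothetical stand-in for a real HTTP request (no network involved)."""
    await asyncio.sleep(0.01)  # simulate I/O latency
    return f"body of {url}"


async def fetch_all(urls: list[str], max_concurrency: int = 3) -> list[str]:
    """Fetch every URL with at most max_concurrency fetches in flight."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def bounded(url: str) -> str:
        async with semaphore:  # tasks wait here once the limit is reached
            return await fake_fetch(url)

    # gather returns results in the same order as the input URLs
    return await asyncio.gather(*(bounded(url) for url in urls))


if __name__ == "__main__":
    demo_urls = [f"https://example.com/{i}" for i in range(5)]
    pages = asyncio.run(fetch_all(demo_urls))
    print(len(pages))
```

Creating the semaphore inside `fetch_all` keeps it bound to the running event loop, and wrapping only the fetch in `async with` means the limit applies to requests rather than to task creation.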

The tests use local HTML fixtures and `httpx.MockTransport`, so the suite is deterministic and runs offline. No live websites are hit in CI.

Scrape responsibly

These examples are for learning. When scraping real sites, respect robots.txt, rate-limit your requests, identify your client honestly, and follow each site's terms of service.
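
One way to act on the first two points is the standard library's `urllib.robotparser`. The sketch below is illustrative only: the `pws-examples-bot` user agent, the sample rules, and the helper names are assumptions, and a real scraper would load the site's actual robots.txt rather than an inline string.

```python
import time
from collections.abc import Iterable, Iterator
from urllib.robotparser import RobotFileParser

USER_AGENT = "pws-examples-bot"  # hypothetical client name

# Sample rules for illustration; not taken from any real site
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Crawl-delay: 1
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt content that has already been downloaded."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def throttled(urls: Iterable[str], delay_seconds: float) -> Iterator[str]:
    """Yield URLs one at a time, sleeping between consecutive items."""
    for index, url in enumerate(urls):
        if index:  # no pause before the first URL
            time.sleep(delay_seconds)
        yield url


if __name__ == "__main__":
    parser = build_parser(ROBOTS_TXT)
    candidates = [
        "https://example.com/products",
        "https://example.com/private/data",
        "https://example.com/about",
    ]
    allowed = [u for u in candidates if parser.can_fetch(USER_AGENT, u)]
    delay = parser.crawl_delay(USER_AGENT) or 1  # fall back to 1 second
    for url in throttled(allowed, delay):
        print("would fetch", url)
```

Sending the same `USER_AGENT` string in the request headers keeps the identity checked against robots.txt consistent with the identity the server sees.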

Contributing

Contributions are welcome. Please keep each example small and focused, add a test for it, and make sure both `pytest` and `ruff check .` pass before opening a PR.
