Python Web Scraping — Examples


Runnable, tested Python web scraping examples — the companion code for python-web-scraping.com.

Every example is a small, self-contained function with a test. Each module maps to a guide on the site, so you can read the explanation there and run the code here.

Quick start

git clone https://github.com/python-web-scraping-com/examples.git
cd examples
python -m venv .venv && source .venv/bin/activate   # Windows: .venv\Scripts\activate
pip install -e ".[dev]"

Run any example directly:

python -m pws_examples.parsing
python -m pws_examples.storage
python -m pws_examples.async_scrape   # this one makes real HTTP requests

Run the tests and linter:

pytest
ruff check .

What's inside

| Module | What it shows | Guide |
| --- | --- | --- |
| pws_examples/parsing.py | Parse product cards and HTML tables with BeautifulSoup | Parsing HTML with BeautifulSoup |
| pws_examples/extraction.py | Extract emails and phone numbers with regex | Extracting Data with Regular Expressions |
| pws_examples/http_client.py | A requests session with retries and backoff | Understanding HTTP Requests and Responses |
| pws_examples/storage.py | Validate with Pydantic, store in SQLite, de-duplicate | Storing and Exporting Scraped Data |
| pws_examples/async_scrape.py | Concurrent fetching with asyncio + HTTPX and a semaphore | Asynchronous Scraping with asyncio and HTTPX |
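
To give a feel for the semaphore pattern named in the last row, here is a minimal standard-library sketch. It is not the code in pws_examples/async_scrape.py: the names fetch and fetch_all are hypothetical, and the network call is replaced by asyncio.sleep so the snippet runs offline.

```python
import asyncio


async def fetch(url: str, delay: float = 0.01) -> str:
    # Hypothetical stand-in for a real HTTP call: sleeps instead of
    # touching the network, then returns a fake page body.
    await asyncio.sleep(delay)
    return f"<html>{url}</html>"


async def fetch_all(urls: list[str], limit: int = 3) -> list[str]:
    # The semaphore caps how many fetches are in flight at once.
    semaphore = asyncio.Semaphore(limit)

    async def bounded(url: str) -> str:
        async with semaphore:
            return await fetch(url)

    # gather() returns results in the same order as the input URLs.
    return await asyncio.gather(*(bounded(u) for u in urls))


if __name__ == "__main__":
    demo_urls = [f"https://example.com/page/{i}" for i in range(10)]
    pages = asyncio.run(fetch_all(demo_urls))
    print(f"fetched {len(pages)} pages")
```

In a real scraper the body of fetch would presumably await an HTTP client call instead of sleeping; the bounding logic stays the same.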

The tests use local HTML fixtures and httpx.MockTransport, so the suite is deterministic and runs offline — no live websites are hit in CI.

Scrape responsibly

These examples are for learning. When scraping real sites, respect robots.txt, rate-limit your requests, identify your client honestly, and follow each site's terms of service.
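
Two of these points, honoring robots.txt and spacing out requests, can be sketched with the standard library alone. The user agent string, the sample robots.txt rules, and the function names below are hypothetical; a real scraper would load the site's own robots.txt rather than an inline string.

```python
import time
from collections.abc import Callable, Iterable, Iterator
from urllib.robotparser import RobotFileParser

USER_AGENT = "example-learning-bot"  # hypothetical client name

# Hypothetical rules, inlined so the sketch runs offline.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Crawl-delay: 2
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def polite_urls(
    parser: RobotFileParser,
    urls: Iterable[str],
    default_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Iterator[str]:
    # Prefer the site's Crawl-delay; fall back to our own default.
    delay = parser.crawl_delay(USER_AGENT) or default_delay
    first = True
    for url in urls:
        if not parser.can_fetch(USER_AGENT, url):
            continue  # disallowed by robots.txt, so skip it
        if not first:
            sleep(delay)  # pause between consecutive allowed URLs
        first = False
        yield url
```

The sleep parameter is injectable so a test can record the pauses instead of actually waiting.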

Contributing

Contributions are welcome. Please keep each example small and focused, add a test for it, and make sure the tests (pytest) and the linter (ruff check .) both pass before opening a PR.

License

MIT © Python Web Scraping
