📦 Installation Guide

This guide will help you install and set up the Advanced Python Web Scraper on your system.

🔧 System Requirements

📥 Installation Methods

Method 1: Clone from GitHub

git clone https://github.com/I-invincib1e/advanced-python-scraper.git
cd advanced-python-scraper

Method 2: Direct Download

Download the ZIP file from GitHub and extract it.

🐍 Python Dependencies

Core Dependencies (Required)

pip install aiohttp beautifulsoup4 tqdm pyyaml

Backend-Specific Dependencies (Optional)

For JavaScript Rendering (Recommended):

pip install playwright
playwright install  # Install browser binaries

💡 Recommendation: Use Playwright over requests-html for JavaScript sites. Playwright is actively maintained, faster, and more reliable.
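
If you want to confirm that both the Playwright package and its browser binaries are usable, a small check like the following may help. It is only a sketch: the function name `playwright_ready` is hypothetical and not part of the scraper, and it assumes the default Chromium build installed by `playwright install`. It renders an inline page, so it needs no network access.

```python
import asyncio


async def playwright_ready() -> bool:
    """Return True if Playwright can launch headless Chromium and run page JavaScript."""
    try:
        from playwright.async_api import async_playwright
    except ImportError:
        return False  # the Python package is missing: pip install playwright

    html = (
        "<h1 id='t'></h1>"
        "<script>document.getElementById('t').textContent = 'rendered'</script>"
    )
    try:
        async with async_playwright() as p:
            browser = await p.chromium.launch()
            try:
                page = await browser.new_page()
                # set_content loads the HTML directly, so no URL is fetched
                await page.set_content(html)
                return await page.inner_text("#t") == "rendered"
            finally:
                await browser.close()
    except Exception:
        # Most often the browser binaries are missing: run `playwright install`
        return False


if __name__ == "__main__":
    ok = asyncio.run(playwright_ready())
    print("Playwright ready" if ok else "Playwright not ready")
```

A result of "not ready" right after `pip install playwright` usually means the `playwright install` step has not been run yet.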

For JavaScript Rendering (Legacy):

pip install requests-html # Not recommended - unmaintained

Install All Backends:

pip install aiohttp beautifulsoup4 tqdm pyyaml playwright
playwright install

✅ Verification

Test your installation:

python -c "import aiohttp, bs4, tqdm, yaml; print('✅ Core dependencies installed')"
python advanced_scraper.py --help # Should show help information
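
The one-line import check above stops at the first missing module. As an alternative, the sketch below reports every missing package at once, including the optional backends. The helper names (`missing_packages`, `report`) are hypothetical and not part of the scraper; the package lists are taken from the install commands in this guide.

```python
from importlib.util import find_spec

# pip package name -> import name (they differ for beautifulsoup4 and pyyaml)
CORE = {"aiohttp": "aiohttp", "beautifulsoup4": "bs4", "tqdm": "tqdm", "pyyaml": "yaml"}
OPTIONAL = {"playwright": "playwright", "requests-html": "requests_html"}


def missing_packages(packages):
    """Return the pip names whose import module cannot be located."""
    return [pip_name for pip_name, module in packages.items() if find_spec(module) is None]


def report():
    """Print a short summary and return True if all core dependencies are present."""
    core_missing = missing_packages(CORE)
    optional_missing = missing_packages(OPTIONAL)
    if core_missing:
        print("Missing core packages: pip install " + " ".join(core_missing))
    else:
        print("Core dependencies installed")
    if optional_missing:
        print("Optional backends not installed: " + ", ".join(optional_missing))
    return not core_missing


if __name__ == "__main__":
    report()
```

Only the core list affects the return value, since the backends are optional.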

⚙️ Configuration

Create a configuration file:

{
  "rate_limiting": {
    "delay": 0.2,
    "enabled": true
  },
  "retry": {
    "max_attempts": 5,
    "backoff_factor": 1.0
  },
  "user_agent_rotation": {
    "enabled": true
  },
  "concurrency": {
    "max_concurrent": 8
  }
}
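
This guide does not say how the scraper reads this file, so the following is only a sketch of one way to load and sanity-check such a configuration before a run. The file name `config.json`, the `load_config` helper, and the validation rules are assumptions, not part of the scraper's API; the default values simply mirror the sample above.

```python
import json
from pathlib import Path

# Defaults mirror the sample configuration shown in this guide.
DEFAULTS = {
    "rate_limiting": {"delay": 0.2, "enabled": True},
    "retry": {"max_attempts": 5, "backoff_factor": 1.0},
    "user_agent_rotation": {"enabled": True},
    "concurrency": {"max_concurrent": 8},
}


def load_config(path="config.json"):
    """Merge a JSON config file over DEFAULTS and reject obviously bad values."""
    merged = {section: dict(values) for section, values in DEFAULTS.items()}
    file = Path(path)
    if file.exists():
        user = json.loads(file.read_text(encoding="utf-8"))
        for section, values in user.items():
            if not isinstance(values, dict):
                raise ValueError(f"section {section!r} must be a JSON object")
            merged.setdefault(section, {}).update(values)

    # Assumed sanity rules; adjust to whatever the scraper actually accepts.
    if merged["rate_limiting"]["delay"] < 0:
        raise ValueError("rate_limiting.delay must be >= 0")
    if merged["retry"]["max_attempts"] < 1:
        raise ValueError("retry.max_attempts must be >= 1")
    if merged["concurrency"]["max_concurrent"] < 1:
        raise ValueError("concurrency.max_concurrent must be >= 1")
    return merged


if __name__ == "__main__":
    print(json.dumps(load_config(), indent=2))
```

Sections missing from the file fall back to the defaults, so a partial file such as `{"retry": {"max_attempts": 2}}` is enough to override a single setting.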

🚀 Quick Test

Test the scraper with a simple example:

import asyncio

from advanced_scraper import AdvancedBookScraper


async def test():
    async with AdvancedBookScraper("https://httpbin.org") as scraper:
        content = await scraper.scrape_single_page("https://httpbin.org/html")
        print(f"✅ Scraped {len(content)} characters")


asyncio.run(test())

🎉 Installation Complete!
Your Advanced Python Web Scraper is now ready to use.

📚 Next Steps

View Examples
API Reference
GitHub Repository