wind2sing / aCrawler
A powerful web-crawling framework, based on aiohttp.
☆15 · Updated 5 years ago
Alternatives and similar repositories for aCrawler
Users that are interested in aCrawler are comparing it to the libraries listed below
- Pyppeteer integration for Scrapy ☆58 · Updated 4 years ago
- Async wrapper for requests / aiohttp, and some crawler toolkits. Let synchronization code enjoy the performance of asynchronous programmi… ☆24 · Updated 5 months ago
- Scrapy + Puppeteer ☆110 · Updated 4 years ago
- python 3.7 asyncio tutorial. ☆14 · Updated 5 years ago
- tus.io protocol implementation for aiohttp.web applications ☆17 · Updated 2 years ago
- Zyte Automatic Extraction integration for Scrapy ☆56 · Updated 3 years ago
- A complimentary proxy to help to use SPM with headless browsers ☆108 · Updated 2 years ago
- Use pyppeteer from a Scrapy spider ☆59 · Updated 5 years ago
- More flexible and featured Frontera scheduler for Scrapy ☆37 · Updated last month
- Asyncio web crawling framework. Work in progress. ☆19 · Updated 11 months ago
- Extract structured data from HTML and XML documents like a boss. ☆49 · Updated 7 months ago
- Page Object pattern for Scrapy ☆123 · Updated last week
- Web scraping Page Objects core library ☆102 · Updated 2 weeks ago
- A tool for parsing Scrapy log files periodically and incrementally, extending the HTTP JSON API of Scrapyd. ☆93 · Updated 6 months ago
- A scrapy extension to sync `.scrapy` folder to an S3 bucket ☆17 · Updated 3 years ago
- Simple Web UI for Scrapy spider management via Scrapyd ☆51 · Updated 7 years ago
- Library to populate items using XPath and CSS with a convenient API ☆48 · Updated 2 weeks ago
- A RabbitMQ Scheduler for Scrapy ☆87 · Updated 2 years ago
- A simple, Qt-Webengine powered web browser with built in functionality for basic scrapy webscraping support. ☆109 · Updated last year
- Lightweight library that converts a HTML webpage to JSON data using a template defined in JSON. ☆23 · Updated last month
- A Scrapy middleware to bypass the CloudFlare's anti-bot protection ☆110 · Updated 4 years ago
- Trio driver for Chrome DevTools Protocol (CDP) ☆68 · Updated 3 years ago
- A decorator to write coroutine-like spider callbacks. ☆109 · Updated 2 years ago
- Fast Indexed python HTML parser which builds a DOM node tree, providing common getElementsBy* functions for scraping, testing, modificati… ☆102 · Updated 2 years ago
- Splash + HAProxy + Docker Compose ☆197 · Updated 6 years ago
- Pre-built Scrapy spiders for AutoExtract ☆19 · Updated last year
- Scrapy integration with Tor for anonymous web scraping ☆46 · Updated 9 years ago
- CoCrawler is a versatile web crawler built using modern tools and concurrency. ☆191 · Updated 3 years ago
- A Ruia plugin for loading javascript - pyppeteer ☆18 · Updated 3 years ago
- An efficient and lightweight thread pool ☆38 · Updated 4 years ago