Scrapy extension which writes crawled items to Kafka
☆30Feb 10, 2026Updated 3 weeks ago
Alternatives and similar repositories for scrapy-kafka-export
Users who are interested in scrapy-kafka-export are comparing it to the libraries listed below
- ☆12Oct 20, 2022Updated 3 years ago
- Scrapy Eagle is a tool that allows us to run any Scrapy-based project in a distributed fashion and monitor how it is going and how many…☆24Sep 4, 2020Updated 5 years ago
- [UNMAINTAINED] Deploy, run and monitor your Scrapy spiders.☆12Feb 23, 2026Updated 2 weeks ago
- Kafka-based components for Scrapy☆78Apr 10, 2018Updated 7 years ago
- Show summary of a large number of URLs in a Jupyter Notebook☆17Feb 10, 2026Updated 3 weeks ago
- Exporters is an extensible export pipeline library that supports filter, transform and several sources and destinations☆40May 21, 2024Updated last year
- Scrapy schema validation pipeline and Item builder using JSON Schema☆45Mar 26, 2021Updated 4 years ago
- Automatic unit test generation for Scrapy.☆57Jul 12, 2021Updated 4 years ago
- A multi-process timed rotating logging file handler (based on logging.RotatingFileHandler and ConcurrentLogHandler)☆22Dec 16, 2022Updated 3 years ago
- Frontera backend to guide a crawl using PageRank, HITS or other ranking algorithms based on the link structure of the web graph, even whe…☆55May 21, 2024Updated last year
- This Scrapy project uses Redis and Kafka to create a distributed on demand scraping cluster.☆1,229Nov 7, 2023Updated 2 years ago
- Skinfer is a tool for inferring and merging JSON schemas☆141Apr 24, 2024Updated last year
- Scrapy middleware that allows crawling only new content☆79Feb 10, 2026Updated 3 weeks ago
- A CLI for benchmarking Scrapy.☆32Jun 28, 2025Updated 8 months ago
- Data-Science-Projects-in-Python☆11Jul 25, 2018Updated 7 years ago
- movie-recommendation-system-GUI☆10Aug 15, 2020Updated 5 years ago
- WebAnnotator is a tool for annotating Web pages. WebAnnotator is implemented as a Firefox extension (https://addons.mozilla.org/en-US/fi…☆48Dec 17, 2021Updated 4 years ago
- Formasaurus tells you the type of an HTML form and its fields using machine learning☆120Feb 23, 2026Updated 2 weeks ago
- HTTP API for Scrapy spiders☆881Feb 16, 2026Updated 3 weeks ago
- Reverse IP Lookup Tool that allows you to use an IP address to identify all websites hosted on a server.☆10Jun 29, 2016Updated 9 years ago
- Scraping Amazon website using Proxies for extracting Mobile details☆13May 30, 2019Updated 6 years ago
- ☆11Jun 7, 2023Updated 2 years ago
- Create Unlimited Facebook Account with Email and Number☆10Feb 24, 2021Updated 5 years ago
- Extract (DOM tree) repetitions from a webpage☆12Jan 13, 2014Updated 12 years ago
- ☆10Aug 19, 2022Updated 3 years ago
- BlockCAT token sale smart contracts.☆11Oct 19, 2017Updated 8 years ago
- ☆20Nov 16, 2014Updated 11 years ago
- Sentiment analysis of hot entries (ホッテントリ)☆12Apr 8, 2018Updated 7 years ago
- In this repo I show how to simply create an API for your machine learning models in Python☆12Nov 28, 2018Updated 7 years ago
- Zipkin client for asgi. Compatible with Starlette Framework and Jaeger tracing server☆10Apr 21, 2024Updated last year
- bigram / trigram analysis of wikipedia; mainly mutual info☆22Mar 6, 2012Updated 14 years ago
- Standalone firmware for iHeater — a chamber heater controller for 3D printers. Works independently or integrates with Klipper over USB.☆13Feb 25, 2026Updated last week
- Sends the same request to one or more sources, trading cost for reduced inference latency☆11Dec 17, 2024Updated last year
- Provides a basic integration for ipfs (storage/distribution) and ethereum blockchain (validation/authorization) based EDIfact message exc…☆11Jul 27, 2016Updated 9 years ago
- A semantic web crawler☆20Sep 20, 2010Updated 15 years ago
- ☆13Jul 16, 2013Updated 12 years ago
- This project deals with hierarchical classification of web pages based on dmoz dataset.☆14Apr 10, 2014Updated 11 years ago
- Elevator is an open source, on-disk key-value store. Provides high-performance bulk read-write operations over very large datasets while …☆70May 14, 2014Updated 11 years ago
- A Python implementation based on the official Redis article on distributed locks☆10Jan 16, 2021Updated 5 years ago