TeamHG-Memex / scrapy-kafka-export
Scrapy extension which writes crawled items to Kafka
☆30 · Updated this week
Alternatives and similar repositories for scrapy-kafka-export
Users that are interested in scrapy-kafka-export are comparing it to the libraries listed below
- ☆12 · Oct 20, 2022 · Updated 3 years ago
- Deprecated HubStorage client library - please use python-scrapinghub>=1.9.0 instead ☆16 · Dec 5, 2016 · Updated 9 years ago
- [UNMAINTAINED] Deploy, run and monitor your Scrapy spiders. ☆11 · Updated this week
- Show summary of a large number of URLs in a Jupyter Notebook ☆17 · Updated this week
- Kafka-based components for Scrapy ☆78 · Apr 10, 2018 · Updated 7 years ago
- Exporters is an extensible export pipeline library that supports filter, transform and several sources and destinations ☆40 · May 21, 2024 · Updated last year
- A Scrapy extension to log items coverage when the spider shuts down ☆19 · Apr 11, 2020 · Updated 5 years ago
- Automatic unit test generation for Scrapy. ☆57 · Jul 12, 2021 · Updated 4 years ago
- Frontera backend to guide a crawl using PageRank, HITS or other ranking algorithms based on the link structure of the web graph, even whe… ☆55 · May 21, 2024 · Updated last year
- This Scrapy project uses Redis and Kafka to create a distributed on demand scraping cluster. ☆1,230 · Nov 7, 2023 · Updated 2 years ago
- Scrapy middleware which allows to crawl only new content ☆79 · Updated this week
- ☆13 · Jun 26, 2024 · Updated last year
- Flask based Movie Recommendation System ☆12 · May 1, 2023 · Updated 2 years ago
- A CLI for benchmarking Scrapy. ☆32 · Jun 28, 2025 · Updated 7 months ago
- Data-Science-Projects-in-Python ☆11 · Jul 25, 2018 · Updated 7 years ago
- SocketLabs Email Delivery PHP Client Library ☆10 · Dec 11, 2023 · Updated 2 years ago
- tokviz is a Python library for visualizing tokenization patterns across different language models. ☆12 · Apr 25, 2024 · Updated last year
- Request per second / SQLop per second monitoring for Django, using Redis for storage ☆101 · Sep 6, 2010 · Updated 15 years ago
- WebAnnotator is a tool for annotating Web pages. WebAnnotator is implemented as a Firefox extension (https://addons.mozilla.org/en-US/fi… ☆48 · Dec 17, 2021 · Updated 4 years ago
- Formasaurus tells you the type of an HTML form and its fields using machine learning ☆119 · Updated this week
- HTTP API for Scrapy spiders ☆879 · Updated this week
- Create Unlimited Facebook Account with Email and Number ☆10 · Feb 24, 2021 · Updated 4 years ago
- An application that takes your current location, address, or latitude/longitude and returns a map showing crimes that have occurred near … ☆10 · Dec 6, 2020 · Updated 5 years ago
- Performance tests for multinode NGC.Ready certification ☆15 · Jan 28, 2026 · Updated 2 weeks ago
- Mass Domain Availability Check Script ☆13 · Sep 5, 2020 · Updated 5 years ago
- ☆11 · Jun 7, 2023 · Updated 2 years ago
- Extract (DOM tree) repetitions from a webpage ☆12 · Jan 13, 2014 · Updated 12 years ago
- Scraping Amazon website using Proxies for extracting Mobile details ☆13 · May 30, 2019 · Updated 6 years ago
- Reverse IP Lookup Tool that allows you to use an IP address to identify all websites hosted on a server. ☆10 · Jun 29, 2016 · Updated 9 years ago
- BlockCAT token sale smart contracts. ☆11 · Oct 19, 2017 · Updated 8 years ago
- ☆10 · Aug 19, 2022 · Updated 3 years ago
- A python program that scrapes an Etsy URL and displays all sellers and their info. (Works as of Feb 2023) ☆11 · Mar 8, 2023 · Updated 2 years ago
- In this repo I show how to simple create an API for your machine learning models in Python ☆12 · Nov 28, 2018 · Updated 7 years ago
- Creates a pipeline Airflow and Scrapy to output an average image composition of everyone's face in a given website ☆44 · Oct 13, 2017 · Updated 8 years ago
- Multi Browser Kango Extension for BGPView - A DNS and BGP network visualizer ☆10 · May 16, 2017 · Updated 8 years ago
- Get tweets and save file in JSON format without Twitter API ☆11 · Jan 1, 2019 · Updated 7 years ago
- urlscan.io API wrapper for Ruby ☆13 · Oct 16, 2023 · Updated 2 years ago
- This project deals with hierarchical classification of web pages based on dmoz dataset. ☆14 · Apr 10, 2014 · Updated 11 years ago
- Elevator is an open source, on-disk key-value store. Provides high-performance bulk read-write operations over very large datasets while … ☆70 · May 14, 2014 · Updated 11 years ago