ralacher / phpBB_crawler

Scrapy spider that crawls phpBB forums and extracts information; supports authentication

☆8

Alternatives and similar repositories for phpBB_crawler

Users who are interested in phpBB_crawler compare it to the repositories listed below

vinta / haul
An Extensible Image Crawler
☆160 Updated 8 years ago
Alir3z4 / python-sanitize
Bringing sanity to world of messed-up data
☆66 Updated 10 years ago
odie5533 / WarcMiddleware
WarcMiddleware lets users seamlessly download a mirror copy of a website when running a web crawl with the Python web crawler Scrapy.
☆47 Updated 7 years ago
Zulko / twittcher
Python module to watch Twitter user pages or search-results.
☆62 Updated 10 years ago
agarden / remove-pdf-watermark
Short script for removing watermarks from PDF files. Requires pdftk.
☆59 Updated 6 years ago
ncouture / python-search-engine
Search engine base (crawler, indexer and parser) using Python, Celery, RabbitMQ, CouchDB and Whoosh.
☆11 Updated last month
ArchiveTeam / NewsGrabber
Grabbing all news.
☆62 Updated 5 years ago
jaysw / ipydb
Turn your IPython console into a cross-database SQL client
☆31 Updated 9 years ago
NadalVRoMa / PyLibGen
A python script to download books from libgen.io
☆75 Updated 6 years ago
jabbalaci / Jabba-Webkit
Jabba's headless webkit browser for scraping AJAX-powered webpages.
☆91 Updated 10 years ago
matiasb / demiurge
PyQuery-based scraping micro-framework.
☆117 Updated 3 years ago
nikoma / Old-Apphera-Dashboard
Open Source Social Media Monitoring And Engagement System Core/API
☆36 Updated 10 years ago
CompileInc / cappy
☕🗄CAching Proxy in Python – Simple file based python http proxy
☆15 Updated 3 years ago
wangjiezhe / FetchNovels
Fetch novels from internet
☆13 Updated 4 years ago
18F / scrapebox
A simple, system independent infrastructure for performing web scraping. Utilizes Vagrant virtualbox interface and puppet provisioning to…
☆24 Updated 10 years ago
fullscale / pypes
A component based data flow framework with a drag-n-drop Web 2.0 interface. Based on Stackless Python and inspired by Yahoo! Pipes.
☆150 Updated 12 years ago
bdesham / chrome-export
Python scripts to convert Google Chrome’s bookmarks and history to the standard HTML-ish bookmarks file format.
☆205 Updated 3 years ago
alexandrevicenzi / fluentmail
Python SMTP client and Email for Humans™
☆82 Updated 6 years ago
liberit / scraptils
scraper related helper functions
☆27 Updated 11 years ago
narimiran / scopy
Python script for searching through your digital books and cataloguing them in an easy-to-share list of files.
☆31 Updated 5 years ago
tuomas2 / automate
A general purpose Python automatization library with nifty real-time web UI
☆30 Updated last month
sdushantha / pyradio
📻 Play your favorite radio station from the terminal
☆76 Updated 5 years ago
dcondrey / scrapy-spiders
Collection of python scripts I have created to crawl various websites, mostly for lead generation projects to match keywords and collect …
☆131 Updated last year
btimby / fulltext
Python library for extracting text from various file formats (for indexing).
☆113 Updated 3 years ago
myles / python-wp
A Python library for interacting with WordPress REST API.
☆40 Updated 3 years ago
asyne / cproto
Chrome Debugging client for Python
☆33 Updated 5 years ago
arpit1997 / PQusic
A Music playlist app
☆13 Updated 7 years ago
niwinz / phantompy
Phantompy is a headless WebKit engine with a powerful Pythonic API built on top of Qt5 WebKit
☆613 Updated 8 years ago
scrapy-plugins / scrapy-magicfields
Scrapy middleware to add extra fields to items, like timestamp, response fields, spider attributes etc.
☆56 Updated 3 years ago
fusionbox / mouseware
Secure random passwords in javascript
☆18 Updated 5 years ago