rmax / scrapy-boilerplate
Small set of utilities to simplify writing Scrapy spiders.
☆49Jul 24, 2015Updated 10 years ago
Alternatives and similar repositories for scrapy-boilerplate
Users that are interested in scrapy-boilerplate are comparing it to the libraries listed below
- A helper to create web scrapers using scrapy selector in a Model based structure☆57Dec 26, 2022Updated 3 years ago
- provisioning a VPS to run Django with Ansible☆15May 13, 2025Updated 9 months ago
- Paginating the web☆37Feb 11, 2014Updated 12 years ago
- Clarify your words with emojis☆12Aug 25, 2016Updated 9 years ago
- Extensions for using Scrapy on Amazon AWS☆32Dec 5, 2012Updated 13 years ago
- A decorator to write coroutine-like spider callbacks.☆109Dec 26, 2022Updated 3 years ago
- Detect and classify pagination links☆15Sep 9, 2020Updated 5 years ago
- Scrapy downloader middleware that stores response HTMLs to disk.☆18Jan 14, 2026Updated last month
- Crochet-based blocking API for Scrapy.☆46Feb 24, 2017Updated 8 years ago
- ☆143Nov 24, 2015Updated 10 years ago
- Restrict crawl and scraping scope using matchers.☆26Jun 8, 2016Updated 9 years ago
- Tool to flatten stream of JSON-like objects, configured via schema☆33Oct 19, 2019Updated 6 years ago
- ☆68Sep 7, 2018Updated 7 years ago
- Python for students in humanities, NRU HSE, 2018-2019☆18Mar 7, 2023Updated 2 years ago
- Argument Parsing for Humans™☆207Jul 7, 2017Updated 8 years ago
- This is a collection of mostly R code to use text mining to analyse conference abstracts, blogs and other sources in an attempt to look f…☆42Sep 9, 2015Updated 10 years ago
- A Scrapy pipeline that sends items to an Elasticsearch server☆98Jan 2, 2018Updated 8 years ago
- Sustainable Open Source: The Book (Maybe)☆31Dec 27, 2016Updated 9 years ago
- Python-based cross-platform tool for mining text data (html, transcript, problems) of edX MOOCs on a user's dashboard. It is an extension…☆10Feb 12, 2020Updated 6 years ago
- How to add formulas to Google Spreadsheet using Google Apps Script - Sarmad Gardezi☆17Apr 24, 2025Updated 9 months ago
- A Sublime Text plugin to move through and reform things☆179Sep 28, 2023Updated 2 years ago
- A collection of github workflow patterns☆10Feb 1, 2024Updated 2 years ago
- Python wrapper for Apache OpenNLP tools☆34Nov 23, 2016Updated 9 years ago
- Wordpress plugin for Magic the Gathering that enables card tooltips and formatted deck listings.☆13Dec 24, 2025Updated last month
- NER toolkit for HTML data☆259May 3, 2024Updated last year
- A project that crawls backpack information and images from Amazon using Python and Scrapy and stores the data in a SQLite database.☆34Sep 25, 2015Updated 10 years ago
- Output scrapy statistics to graphite/carbon☆54Mar 9, 2013Updated 12 years ago
- Python scrapers for legal NZ TV/movie streaming sites☆11Dec 13, 2015Updated 10 years ago
- Rossmann Store Sales: https://www.kaggle.com/c/rossmann-store-sales☆10May 13, 2018Updated 7 years ago
- A generic crawler☆78Updated this week
- PyData Boston 2013 talks: "Intro to scikit-learn" & "Realtime Predictive Analytics: Using scikit-learn and RabbitMQ"☆11Jan 5, 2014Updated 12 years ago
- Materials and reproducible workflows for working with health care data☆12Apr 11, 2018Updated 7 years ago
- Static photoessay generator using gulp.js☆10Mar 20, 2019Updated 6 years ago
- A program designed to merge banking data from UP Bank Australia into Firefly III☆13Sep 9, 2024Updated last year
- ☆12Apr 24, 2017Updated 8 years ago
- Crawlers for the Taiga Corpus and Taiga Parser project, downloading resources from open sources☆14Apr 9, 2019Updated 6 years ago
- A set of R scripts to visualize and analyze bias in the polls☆24Sep 21, 2013Updated 12 years ago
- A generic interface wrapping multiple backends to provide a consistent pubsub API☆13Oct 31, 2018Updated 7 years ago
- DeepAlign: Alignment-based Process Anomaly Correction Using Recurrent Neural Networks☆10Mar 25, 2023Updated 2 years ago