scrapinghub/autopager

Detect and classify pagination links

☆15

Alternatives and similar repositories for autopager

Users who are interested in autopager are comparing it to the libraries listed below.

TeamHG-Memex / url-summary
Show summary of a large number of URLs in a Jupyter Notebook
☆19 · Apr 8, 2026 · Updated 3 months ago
scrapinghub / page_finder
Find which links on a web page are pagination links
☆29 · Jan 12, 2017 · Updated 9 years ago
scrapinghub / scrapy-mosquitera
Restrict crawl and scraping scope using matchers.
☆26 · Jun 8, 2016 · Updated 10 years ago
scrapinghub / flatson
Tool to flatten stream of JSON-like objects, configured via schema
☆33 · Oct 19, 2019 · Updated 6 years ago
scrapinghub / autologin
A project to attempt to automatically login to a website given a single seed
☆11 · Jun 17, 2024 · Updated 2 years ago
xtannier / WebAnnotator
WebAnnotator is a tool for annotating Web pages. WebAnnotator is implemented as a Firefox extension (https://addons.mozilla.org/en-US/fi…
☆48 · Dec 17, 2021 · Updated 4 years ago
scrapinghub / aile
Automatic Item List Extraction
☆85 · Jun 15, 2016 · Updated 10 years ago
scrapinghub / webpager
Paginating the web
☆37 · Feb 11, 2014 · Updated 12 years ago
psapezhka / grafana-dashboards
Set of useful grafana dashboards
☆14 · Apr 15, 2021 · Updated 5 years ago
hoptical / grafana-skype-alerts
A webhook notifier for sending Grafana alerts to Skype
☆10 · Jun 28, 2024 · Updated 2 years ago
pydepta / pydepta
A python implementation of DEPTA
☆84 · Jan 14, 2017 · Updated 9 years ago
arnupretorius / RWCPrediction
R files containing the code used to predict rugby world cup matches
☆11 · Sep 18, 2015 · Updated 10 years ago
matthewruttley / mozclassify
Algorithms for URL Classification
☆19 · Apr 13, 2015 · Updated 11 years ago
rmax / scrapydo
Crochet-based blocking API for Scrapy.
☆47 · Feb 24, 2017 · Updated 9 years ago
best-doctor / flake8-expression-complexity
flake8 plugin to validate expressions complexity
☆33 · Mar 19, 2022 · Updated 4 years ago
scrapinghub / exporters
Exporters is an extensible export pipeline library that supports filter, transform and several sources and destinations
☆39 · May 21, 2024 · Updated 2 years ago
lestrrat-go / urlenc
Marshal/Unmarshal interface for structs that can encode/decode themselves to URL query strings
☆11 · Jun 6, 2018 · Updated 8 years ago
TeamHG-Memex / arachnado
Web Crawling UI and HTTP API, based on Scrapy and Tornado
☆162 · Apr 8, 2026 · Updated 3 months ago
amiralimadadi / Divar_WebScrap
Web scrap on divar website (Tehran) to generate a dataset on housing price in Tehran.
☆17 · Mar 16, 2024 · Updated 2 years ago
clips / hades
Repository for the CLiPS HAte speech DEtection System [HADES].
☆25 · Apr 5, 2018 · Updated 8 years ago
TeamHG-Memex / extract-html-diff
extract difference between two html pages
☆33 · Apr 8, 2026 · Updated 3 months ago
OpenScraping / openscraping-lib-nodejs
Turn unstructured HTML pages into structured data. The OpenScraping library can extract information from HTML pages using a JSON config f…
☆12 · Aug 23, 2018 · Updated 7 years ago
torch / sys
A system utility package for Torch.
☆13 · Dec 22, 2017 · Updated 8 years ago
postgrespro / pg_tsparser
pg_tsparser - parser for text search
☆17 · Jun 26, 2026 · Updated last month
souvikg10 / spacy-fasttext
The code describes how to load fastText vectors onto spaCy
☆18 · Jan 28, 2021 · Updated 5 years ago
dansandland / cassandra-scrapy-pipeline
☆13 · Dec 10, 2015 · Updated 10 years ago
scrapinghub / skinfer
Skinfer is a tool for inferring and merging JSON schemas
☆141 · Apr 24, 2024 · Updated 2 years ago
fuchsnj / rust_pubsub
Rust Local Publish Subscribe
☆15 · Oct 3, 2017 · Updated 8 years ago
skytreader / CleverAlgorithms-Python
The Clever Algorithms project is an effort to describe a large number of algorithmic techniques from the field of Artificial Intelligence…
☆29 · Oct 28, 2018 · Updated 7 years ago
sebcrozet / rs2cl
Write OpenCL kernels in rust.
☆12 · Sep 28, 2013 · Updated 12 years ago
gr33ndata / dmoz-urlclassifier
Preparing DMOZ dataset for my n-Gram LM-based URL classification research
☆31 · Aug 30, 2014 · Updated 11 years ago
scrapinghub / extruct
Extract embedded metadata from HTML markup
☆967 · Apr 1, 2026 · Updated 3 months ago
praekelt / django-category
Django categorize content app.
☆46 · Jan 4, 2019 · Updated 7 years ago
walidsa3d / routerpass
Find your router's default password
☆14 · Apr 7, 2015 · Updated 11 years ago
scrapinghub / scrapy-autounit
Automatic unit test generation for Scrapy.
☆58 · Jul 12, 2021 · Updated 5 years ago
Parsely / serpextract
Easy extraction of keywords and engines from search engine results pages (SERPs).
☆92 · Oct 20, 2025 · Updated 9 months ago
King-Of-Knights / overcoming-catastrophic
Implementation of "Overcoming catastrophic forgetting in neural networks" in Keras
☆13 · Feb 17, 2019 · Updated 7 years ago
llonchj / scrapy-sentry
Sentry component for Scrapy
☆84 · Aug 21, 2023 · Updated 2 years ago
jsfehler / flake8-multiline-containers
A Flake8 plugin to ensure a consistent format for multiline containers.
☆14 · Jun 19, 2026 · Updated last month