TeamHG-Memex/autopager

Readme badge preview -

If you own this repo, copy the snippet below and add it to your README.md

[![RelatedRepos](https://img.shields.io/badge/related-repos-yellow)](https://relatedrepos.com/gh/TeamHG-Memex/autopager)

TeamHG-Memex / autopager

Detect and classify pagination links

☆107

Alternatives and similar repositories for autopager

Users that are interested in autopager are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.

Sorting:

TeamHG-Memex / MaybeDont
View on GitHub
A component that tries to avoid downloading duplicate content
☆28Apr 8, 2026Updated 3 months ago
TeamHG-Memex / Formasaurus
View on GitHub
Formasaurus tells you the type of an HTML form and its fields using machine learning
☆121Apr 8, 2026Updated 3 months ago
TeamHG-Memex / soft404
View on GitHub
A classifier for detecting soft 404 pages
☆65Apr 8, 2026Updated 3 months ago
TeamHG-Memex / deep-deep
View on GitHub
Adaptive crawler which uses Reinforcement Learning methods
☆167Apr 8, 2026Updated 3 months ago
TeamHG-Memex / extract-html-diff
View on GitHub
extract difference between two html pages
☆33Apr 8, 2026Updated 3 months ago
Wordpress hosting with auto-scaling - Free Trial Offer • Ad
Fully Managed hosting for WordPress and WooCommerce businesses that need reliable, auto-scalable performance. Cloudways SafeUpdates now available.
scrapinghub / scrapy-mosquitera
View on GitHub
Restrict crawl and scraping scope using matchers.
☆26Jun 8, 2016Updated 10 years ago
seagatesoft / webdext
View on GitHub
Intelligent Web Data Extractor
☆74Dec 5, 2022Updated 3 years ago
TeamHG-Memex / autologin-middleware
View on GitHub
Scrapy middleware for the autologin
☆36Apr 8, 2026Updated 3 months ago
scrapinghub / andi
View on GitHub
Library for annotation-based dependency injection
☆24Jul 21, 2026Updated last week
TeamHG-Memex / autologin
View on GitHub
A project to attempt to automatically login to a website given a single seed
☆129Apr 8, 2026Updated 3 months ago
redapple / parslepy
View on GitHub
Python implementation of the Parsley language for extracting structured data from web pages
☆92Oct 26, 2017Updated 8 years ago
scrapinghub / product-extraction-benchmark
View on GitHub
☆16Apr 10, 2026Updated 3 months ago
scrapinghub / webstruct
View on GitHub
NER toolkit for HTML data
☆259May 3, 2024Updated 2 years ago
TeamHG-Memex / html-text
View on GitHub
Extract text from HTML
☆135Apr 8, 2026Updated 3 months ago
Managed hosting for WordPress and PHP on Cloudways • Ad
Managed hosting for WordPress, Magento, Laravel, or PHP apps, on multiple cloud providers. Deploy in minutes on Cloudways by DigitalOcean.
rmax / databrewer
View on GitHub
The missing datasets manager. Like hombrew but for datasets. CLI-tool for search and discover datasets!
☆41May 29, 2017Updated 9 years ago
xtannier / WebAnnotator
View on GitHub
WebAnnotator is a tool for annotating Web pages. WebAnnotator is implemented as a Firefox extension (https://addons.mozilla.org/en-US/fi…
☆48Dec 17, 2021Updated 4 years ago
TeamHG-Memex / undercrawler
View on GitHub
A generic crawler
☆81Apr 8, 2026Updated 3 months ago
scrapinghub / js2xml
View on GitHub
Convert Javascript code to an XML document
☆188Mar 14, 2022Updated 4 years ago
realslimshanky / Spider-Sense
View on GitHub
A browser extension to monitor your spiders deployed on Scrapy Cloud.
☆16Mar 8, 2025Updated last year
scrapinghub / aile
View on GitHub
Automatic Item List Extraction
☆85Jun 15, 2016Updated 10 years ago
matthewruttley / mozclassify
View on GitHub
Algorithms for URL Classification
☆19Apr 13, 2015Updated 11 years ago
llonchj / scrapy-sentry
View on GitHub
Sentry component for Scrapy
☆84Aug 21, 2023Updated 2 years ago
scrapinghub / autopager
View on GitHub
Detect and classify pagination links
☆15Sep 9, 2020Updated 5 years ago
Managed hosting for WordPress and PHP on Cloudways • Ad
Managed hosting for WordPress, Magento, Laravel, or PHP apps, on multiple cloud providers. Deploy in minutes on Cloudways by DigitalOcean.
TeamHG-Memex / sitehound-frontend
View on GitHub
Site Hound (previously THH) is a Domain Discovery Tool
☆24Apr 8, 2026Updated 3 months ago
stummjr / scrapy-fieldstats
View on GitHub
A Scrapy extension to log items coverage when the spider shuts down
☆18Apr 11, 2020Updated 6 years ago
rmax / scrapydo
View on GitHub
Crochet-based blocking API for Scrapy.
☆47Feb 24, 2017Updated 9 years ago
zytedata / html-text
View on GitHub
☆20Oct 6, 2025Updated 9 months ago
TeamHG-Memex / tor-proxy
View on GitHub
a tor socks proxy docker image
☆12Apr 8, 2026Updated 3 months ago
scrapinghub / number-parser
View on GitHub
Parse numbers written in natural language
☆130Oct 23, 2024Updated last year
scrapy-plugins / scrapy-splitvariants
View on GitHub
Scrapy spider middleware to split an item into multiple items using a multi-valued key
☆21Feb 8, 2017Updated 9 years ago
skytreader / CleverAlgorithms-Python
View on GitHub
The Clever Algorithms project is an effort to describe a large number of algorithmic techniques from the field of Artificial Intelligence…
☆29Oct 28, 2018Updated 7 years ago
scrapinghub / price-parser
View on GitHub
Extract price amount and currency symbol from a raw text string
☆346Mar 19, 2026Updated 4 months ago
Managed Database hosting by DigitalOcean • Ad
PostgreSQL, MySQL, MongoDB, Kafka, Valkey, and OpenSearch available. Automatically scale up storage and focus on building your apps.
scrapinghub / spidermon
View on GitHub
Scrapy Extension for monitoring spiders execution.
☆562May 28, 2026Updated 2 months ago
scrapy / w3lib
View on GitHub
Python library of web-related functions
☆419Updated this week
sebcrozet / rs2cl
View on GitHub
Write OpenCL kernels in rust.
☆12Sep 28, 2013Updated 12 years ago
croqaz / awesome-scrapy
View on GitHub
🕶 Awesome list of Scrapy tools and libraries
☆60Jul 6, 2020Updated 6 years ago
kootenpv / deep_eye2mouse
View on GitHub
Move the mouse by your webcam + eyes
☆21Oct 24, 2017Updated 8 years ago
scrapinghub / mdr
View on GitHub
A python library detect and extract listing data from HTML page.
☆110May 5, 2017Updated 9 years ago
scrapinghub / skinfer
View on GitHub
Skinfer is a tool for inferring and merging JSON schemas
☆141Apr 24, 2024Updated 2 years ago