benhoyt / soft404
Soft 404 (dead page) detector in Python
☆13Updated 6 years ago
Alternatives and similar repositories for soft404
Users that are interested in soft404 are comparing it to the libraries listed below
- ☆11Updated 3 years ago
- ☆30Updated last year
- 404 Error Page - Astronaut☆21Updated 5 years ago
- The tech404.github.io website☆16Updated 6 months ago
- 404Games Wastelands V2 - Chernarus☆22Updated 12 years ago
- CMPUT404-assignment-websockets.☆10Updated 11 years ago
- Shepherding our web archives from crawl to access.☆10Updated last year
- CMPUT404-project-socialdistribution☆14Updated 2 years ago
- CMPUT404-assignment-ajax☆11Updated 11 years ago
- COSC 404 - Database System Implementation☆29Updated 8 months ago
- ☆14Updated last year
- Social Feed Manager user interface application.☆156Updated last year
- Please note that the warc-indexer tool & code is now supported by NetArchiveSuite. The 'warc-indexer' directory and code that exists in t…☆128Updated last month
- Discord bot by Sanich for https://youtu.be/1lzPIhTaPDY☆13Updated 3 years ago
- Codecademy's 404 page! ✨☆43Updated 2 years ago
- The Archives Unleashed Toolkit is an open-source toolkit for analyzing web archives.☆146Updated last year
- Backend, IA-specific tools for crawling and processing the scholarly web. Content ends up in https://fatcat.wiki☆27Updated last year
- This repository contains tool and collections dataset for detecting off-topic pages from Web archived collections.☆18Updated 10 years ago
- Conifer setup and deployment via Ansible☆12Updated 5 years ago
- A javascript for Malaysia 404 page☆11Updated 5 years ago
- Tools for bulk indexing of WARC/ARC files on Hadoop, EMR or local file system.☆46Updated 7 years ago
- A LevelDB backed URL unshortening microservice written in JavaScript☆31Updated 2 years ago
- An Apache Spark framework for easy data processing, extraction as well as derivation for web archives and archival collections, developed…☆152Updated last month
- Perpetual Access To The Scholarly Record☆120Updated last year
- Streaming WARC/ARC library for fast web archive IO☆429Updated 8 months ago
- A design prototype for DocNow to learn with☆14Updated 8 years ago
- Humanities Entity Recognition: robust, practical, efficient Named Entity Recognition for today's digital humanist☆37Updated 6 years ago
- Common Crawl fork of Apache Nutch☆35Updated last week
- WASAPI data transfer APIs☆47Updated 3 years ago
- A commandline tool and Python library for archiving data from Facebook using the Graph API.☆78Updated 7 years ago