internetarchive / crawling-for-nomore404
☆27Updated last month
Alternatives and similar repositories for crawling-for-nomore404
Users who are interested in crawling-for-nomore404 are comparing it to the libraries listed below.
- A fun tool for quickly browsing unsourced snippets on Wikipedia.☆111Updated this week
 - URLTeam's second generation of URL shortener archiving tools☆79Updated last month
 - ☆143Updated last week
 - Saving all questions and answers from Yahoo! Answers.☆50Updated 4 years ago
 - A command line tool to archive a git repository from GitHub to the Internet Archive.☆91Updated 4 years ago
 - Github mirror of "analytics/quarry/web" - our actual code is hosted with Gerrit (please see https://www.mediawiki.org/wiki/Developer_acce…☆44Updated 3 years ago
 - Dynamic ToS;DR CMS, used in our frontpage☆51Updated 11 months ago
 - ☆77Updated last week
 - A copyright violation detector running on Wikimedia Cloud Services☆45Updated 10 months ago
 - Perpetual Access To The Scholarly Record☆119Updated last year
 - Archiving GitHub☆10Updated 2 months ago
 - 🔎 Did you know most GitHub Wikis can't index on search engines? Search Engine Enablement for GitHub Wikis service. 400,000+ GitHub Wikis…☆123Updated last month
 - Official Python package for ArchiveBox, the self-hosted internet archiving solution.☆13Updated last year
 - A timezone converter for online events☆15Updated last year
 - Web-based whois gateway written in Python for lighttpd☆26Updated 10 months ago
 - This repository has been moved to GitLab: https://gitlab.wikimedia.org/repos/ci-tools/patchdemo☆26Updated 2 years ago
 - Wikipedia tool that expands bare references☆54Updated 2 weeks ago
 - Citation bot is a tool to expand and format references at Wikipedia. It retrieves citation data from a variety of sources including Cross…☆65Updated this week
 - My collection of scripts that can be used on MediaWiki sites such as Wikipedia.☆12Updated last month
 - A Memento Aggregator CLI and Server in Go☆70Updated 7 months ago
 - Backend, IA-specific tools for crawling and processing the scholarly web. Content ends up in https://fatcat.wiki☆27Updated last year
 - A Memento Client Library in Python☆26Updated 7 years ago
 - Web archive index server based on RocksDB☆36Updated this week
 - Wikipedia 1.0 engine & selection tools☆40Updated last week
 - Wombat.js client-side rewriting library☆107Updated this week
 - A Memento TimeGate☆44Updated 5 years ago
 - Tool to import files from the Internet Archive to Wikimedia Commons.☆18Updated 2 weeks ago
 - The repo for the PetScan tool☆57Updated 3 weeks ago
 - React components to render differences between captures at the Wayback Machine☆35Updated 6 months ago
 - A simple Python wrapper and command-line interface for archive.org’s "Save Page Now" capturing service☆188Updated last year