WikiTeam/wikiteam

Tools for downloading and preserving wikis. We archive wikis, from Wikipedia to the tiniest wikis. As of 2026, WikiTeam has preserved more than 600,000 wikis.

☆857

Alternatives and similar repositories for wikiteam

Users who are interested in wikiteam are comparing it to the repositories listed below.

h4ck3rm1k3 / wikiteam
git svn clone of https://code.google.com/p/wikiteam/
☆13 • Mar 6, 2016 • Updated 10 years ago
openzim / mwoffliner
MediaWiki scraper: all your wiki articles in one highly compressed ZIM file
☆466 • Updated this week
CommunityDataScienceCollective / COVID-19_Digital_Observatory
The COVID-19 Digital Observatory collects, aggregates, and distributes data from social media, search engine results, and Wikipedia to su…
☆11 • Dec 17, 2020 • Updated 5 years ago
ArchiveTeam / wpull
Wget-compatible web downloader and crawler.
☆613 • Apr 29, 2024 • Updated 2 years ago
ArchiveTeam / grab-site
The archivist's web crawler: WARC output, dashboard for all crawls, dynamic ignore patterns
☆1,602 • May 23, 2025 • Updated last year
aaronpk / Local-MediaWiki-Sync
Downloads all pages from a MediaWiki install to local text files.
☆10 • Jan 23, 2024 • Updated 2 years ago
WikiApiary / WikiApiary
Celery-based task workers for collecting and updating data on WikiApiary.
☆31 • Oct 29, 2015 • Updated 10 years ago
bibanon / tubeup
Use yt-dlp to download video/metadata and upload to the Internet Archive.
☆509 • May 8, 2026 • Updated 2 months ago
jjjake / internetarchive
A Python and Command-Line Interface to Archive.org
☆1,887 • Updated this week
webplatform / mediawiki-conversion
Convert MediaWiki XML backup into structured raw text file tree
☆16 • Sep 18, 2015 • Updated 10 years ago
JustAnotherArchivist / little-things
The little things give you away... A collection of various small helper stuff – Mirror repo only, no longer kept in sync, refer to gitea.…
☆24 • Sep 11, 2020 • Updated 5 years ago
blucia0a / CTraps-gcc
Last Writer Slicing: data provenance tracking for concurrent program debugging & analysis
☆13 • Nov 14, 2014 • Updated 11 years ago
mediawiki-utilities / python-mediawiki-utilities
A set of utilities for accessing and processing MediaWiki data.
☆55 • Jan 15, 2019 • Updated 7 years ago
iipc / awesome-web-archiving
An Awesome List for getting started with web archiving
☆2,607 • Apr 27, 2026 • Updated 2 months ago
bayleeadamoss / zazu-chrome-bookmarks
Chrome bookmark searcher for Zazu.
☆10 • Apr 26, 2017 • Updated 9 years ago
ArchiveTeam / ArchiveBot
ArchiveBot, an IRC bot for archiving websites
☆419 • Apr 17, 2026 • Updated 3 months ago
greencardamom / BotWikiAwk
Framework of tools and libraries for building and running bots on Wikipedia
☆28 • May 22, 2026 • Updated 2 months ago
simon987 / awesome-datahoarding
List of data-hoarding related tools
☆1,328 • Sep 14, 2023 • Updated 2 years ago
chfoo / warcat
Tool and library for handling Web ARChive (WARC) files.
☆165 • Oct 11, 2024 • Updated last year
SolidCharity / exportMediaWiki2HTML
Exporting MediaWiki content to HTML
☆34 • Sep 30, 2023 • Updated 2 years ago
miraheze / puppet
Production Puppet code
☆22 • Updated this week
iipc / openwayback
The OpenWayback Development
☆522 • Jan 3, 2024 • Updated 2 years ago
webrecorder / replayweb.page
Serverless replay of web archives directly in the browser
☆965 • Jul 13, 2026 • Updated last week
ArchiveBox / ArchiveBox
🗃 Open source self-hosted web archiving. Takes URLs/browser history/bookmarks/Pocket/Pinboard/etc., saves HTML, JS, PDFs, media, and mor…
☆28,002 • Updated this week
Wikidata / WikibaseImport
Import entities from another Wikibase instance (e.g. Wikidata)
☆13 • May 21, 2023 • Updated 3 years ago
hartator / wayback-machine-downloader
Download an entire website from the Wayback Machine.
☆5,910 • Feb 8, 2024 • Updated 2 years ago
PeterBodifee / MediaWiki-AWS
Deploy MediaWiki on AWS using Elastic Beanstalk
☆10 • Oct 19, 2017 • Updated 8 years ago
ProfessionalWiki / SemanticWikibase
Makes Wikibase data available in Semantic MediaWiki
☆18 • May 10, 2026 • Updated 2 months ago
MattCreative / chrome-bookmarks-converter
A script for converting Chrome bookmark.bak files to Chrome Bookmark.html files so you can import your bookmarks from your AppData file.
☆10 • Jun 6, 2015 • Updated 11 years ago
internetarchive / warcprox
WARC writing MITM HTTP/S proxy
☆456 • Jun 17, 2026 • Updated last month
amanurs / END-OF-THE-WORLD
END OF THE WORLD
☆11 • Mar 12, 2020 • Updated 6 years ago
iipc / warc-specifications
Centralised repository for WARC usage specifications.
☆129 • Apr 4, 2026 • Updated 3 months ago
ArchiveTeam / reddit-grab
Grabbing everything from reddit.
☆62 • Feb 16, 2024 • Updated 2 years ago
ArchiveTeam / universal-tracker
A configurable, reusable tracker with dashboard
☆36 • Dec 15, 2023 • Updated 2 years ago
alard / wget-lua
Wget with Lua extension
☆24 • Dec 17, 2015 • Updated 10 years ago
gnosygnu / xowa
xowa offline wiki application
☆421 • Feb 25, 2022 • Updated 4 years ago
internetarchive / heritrix3
Heritrix is the Internet Archive's open-source, extensible, web-scale, archival-quality web crawler project.
☆3,284 • Jul 15, 2026 • Updated last week
iipc / warc2html
Converts WARC files to static HTML
☆59 • Sep 18, 2025 • Updated 10 months ago
JOSM / areaselector
JOSM Area Selection Plugin
☆19 • May 7, 2026 • Updated 2 months ago