dlenski / wp2git
Downloads and imports Wikipedia page histories to a git repository
☆34Updated 3 months ago
Alternatives and similar repositories for wp2git:
Users who are interested in wp2git are comparing it to the repositories listed below
- Adds a reconciliation API endpoint to Datasette, based on the Reconciliation Service API specification.☆24Updated last year
- Web interface for searching your code using ripgrep, built as a Datasette plugin☆74Updated last year
- Misspelled Words In Context☆38Updated 3 weeks ago
- Command line tool for digging into WARC files☆38Updated this week
- A tool that democratizes and standardizes access to Web APIs.☆12Updated 2 years ago
- A tool for collecting archival slivers of the web and web archives☆12Updated last month
- Platform for journalists to search, analyse, categorise and share unstructured data☆54Updated last month
- Document your SQLite tables and columns with in-line comments☆24Updated last year
- Static Site Generator for Viewing Web Archives (in WACZ format)☆22Updated last year
- Searchable Linkable Open Public Indexed (SLOPI) Communication☆19Updated 2 years ago
- The goal of this project is to create an open source map of accessible pianos. Data will be hosted on OpenStreetMap☆16Updated 2 years ago
- Docker Compose based system for running remote browsers (including Flash and Java support) connected to web archives☆14Updated 3 years ago
- Datasette plugin for outputting iCalendar files☆23Updated 2 years ago
- Save data from Mastodon to a SQLite database☆28Updated 9 months ago
- Convert HTTP Archive (HAR) to Web Archive (WARC) format☆51Updated 6 years ago
- Timelens command-line client☆59Updated last year
- The god of human readable numbers☆12Updated 5 years ago
- Datasette plugin for rendering Markdown☆29Updated last year
- CDXJ Indexing of WARC/ARCs☆25Updated 3 months ago
- Generates large collages of images using OpenSeadragon☆48Updated 11 months ago
- Comparing WARC files☆17Updated 6 years ago
- Backend, IA-specific tools for crawling and processing the scholarly web. Content ends up in https://fatcat.wiki☆25Updated 7 months ago
- Updates Wikidata entries using metadata from GitHub☆45Updated 2 months ago
- Trough: Big data, small databases.☆40Updated 8 months ago
- Recursively deduplicate a directory and write its contents to a new directory while remembering the old paths☆48Updated 4 years ago
- Datasette plugin providing a UI for executing SQL writes against the database☆10Updated 6 months ago
- Pull out versions of specific files from a gitscraping repo into individual files.☆15Updated 3 years ago
- Simple command line tool for quickly analysing the structure of an arbitrary XML file☆31Updated last year
- DocumentCloud's back end source code - Please report bugs, issues and feature requests to info@documentcloud.org☆37Updated last week
- Datasette plugin for outputting tables in formats suitable for copy and paste☆16Updated last year