Python script to create CDX index files of WARC data
☆21Sep 4, 2025Updated 7 months ago
Alternatives and similar repositories for CDX-Writer
Users who are interested in CDX-Writer are comparing it to the libraries listed below.
- The Seesaw pipeline grab script for the URLTeam (terroroftinytown) project☆28Jul 17, 2025Updated 9 months ago
- Tools to analyze web archives☆20Jul 12, 2016Updated 9 years ago
- Archive Research Services Workshop☆31Sep 29, 2017Updated 8 years ago
- An Apache Spark framework for easy data processing, extraction as well as derivation for web archives and archival collections, developed…☆158Oct 8, 2025Updated 6 months ago
- code and data used to build a training dataset for dragnet models☆10Nov 29, 2020Updated 5 years ago
- IPLD Schema Implementation: parser and utilities☆16Updated this week
- Please note that the warc-indexer tool & code is now supported by NetArchiveSuite. The 'warc-indexer' directory and code that exists in t…☆132Nov 21, 2025Updated 5 months ago
- A service that provides archive-aware oEmbed-compatible embeddable surrogates (social cards, thumbnails, etc.) for archived web pages (me…☆14Nov 15, 2021Updated 4 years ago
- avahi in a container☆16Aug 28, 2020Updated 5 years ago
- Kaitai Struct YAML (KSY) schema specification☆15Sep 12, 2025Updated 7 months ago
- JavaScript wrapper for swfobject that detects different flashblock extensions in Chrome, Firefox, Opera and Safari☆19Jun 27, 2012Updated 13 years ago
- Decentralized web Gateway for Internet Archive☆21Jan 4, 2020Updated 6 years ago
- Search engine for every Super Mario Maker 2 level in the world☆17Dec 13, 2023Updated 2 years ago
- Bitcoin Hush☆12Apr 29, 2020Updated 6 years ago
- Create, manage and edit your audio book library from the command line.☆10Oct 20, 2024Updated last year
- Web archiving using Google Chrome☆45Dec 30, 2019Updated 6 years ago
- We have moved!☆10Mar 29, 2016Updated 10 years ago
- A queue-controlled browser automation tool for improving web crawl quality☆65Aug 13, 2025Updated 8 months ago
- Gitignore sample for Cocoa projects☆10Mar 12, 2011Updated 15 years ago
- "Old SFM" -- manage rules and streams from social data sources, starting with twitter.☆86Aug 10, 2023Updated 2 years ago
- A simple macOS app to create valid file and url names from clipboard text. #pypackage☆53Dec 30, 2025Updated 4 months ago
- Automatically configure Wireguard interfaces in distributed system. It supports Consul as backend.☆11Mar 21, 2020Updated 6 years ago
- Convert Directories, Files and ZIP Files to Web Archives (WARC)☆97Apr 22, 2025Updated last year
- ☆11Mar 1, 2025Updated last year
- Archiving URLs (outlinks) from a variety of sources.☆25Mar 27, 2026Updated last month
- Backup of forum.xentax.com. WIP, stuff will be broken, etc.☆12Nov 2, 2023Updated 2 years ago
- Bash script to monitor a source directory and move any added objects to a target directory, ensuring via fail-safe handling of data.☆12Dec 3, 2016Updated 9 years ago
- Continuous unit testing tool☆25Oct 15, 2020Updated 5 years ago
- A space invaders arcade cabinet frontend for the 21st century 8080 microprocessor☆10Mar 14, 2023Updated 3 years ago
- ☆11May 26, 2023Updated 2 years ago
- archiving community contributions on YouTube: unpublished captions, title and description translations and caption credits☆11Oct 29, 2020Updated 5 years ago
- My Presentations in PDF☆29Apr 20, 2016Updated 10 years ago
- Artificial Intelligence/Machine Learning for programming LEGO with Pybricks☆17Feb 22, 2026Updated 2 months ago
- Bug Tracker for the classic SourceForts HL2 Mod☆10Oct 15, 2018Updated 7 years ago
- Library of standard user interface components☆34Dec 21, 2011Updated 14 years ago
- Personal collection of Dagger modules☆12Jan 15, 2026Updated 3 months ago
- A minimal package for saving and reading large HDF5-based chunked arrays.☆15Apr 12, 2022Updated 4 years ago
- A commandline tool that wraps the Archerysec REST API for controlling Archery and executing quick, targeted scans.☆11May 30, 2024Updated last year
- Space Battle Game made in Unity as part of an Edinburgh Napier Group project.☆10Jun 21, 2020Updated 5 years ago