Python script to create CDX index files of WARC data
☆21 · Sep 4, 2025 · Updated 9 months ago
Alternatives and similar repositories for CDX-Writer
Users who are interested in CDX-Writer are comparing it to the libraries listed below.
- Python script to create CDX index files of WARC data ☆16 · Sep 7, 2018 · Updated 7 years ago
- Rewrite of Arc 3.1 with more features, more speed, and bug fixes. Still compatible with Arc 3.1. ☆49 · May 1, 2017 · Updated 9 years ago
- A lispy language that compiles into JavaScript, strongly influenced by Arc. ☆14 · Feb 18, 2011 · Updated 15 years ago
- A JavaScript port of most of Rainbow (conanite's JVM-based Arc implementation). ☆22 · Feb 1, 2022 · Updated 4 years ago
- The Seesaw pipeline grab script for the URLTeam (terroroftinytown) project ☆28 · Jul 17, 2025 · Updated 10 months ago
- Tools to analyze web archives ☆20 · Jul 12, 2016 · Updated 9 years ago
- Editor for New Super Mario Bros. Wii data files ☆66 · Nov 24, 2012 · Updated 13 years ago
- Demo app built using AngularJS with Backand serving as the back end ☆13 · Mar 1, 2017 · Updated 9 years ago
- Archive Research Services Workshop ☆31 · Sep 29, 2017 · Updated 8 years ago
- React components to render differences between captures at the Wayback Machine ☆43 · Jun 3, 2026 · Updated last week
- Bixo is an open source web mining toolkit that runs as a series of Cascading pipes on top of Hadoop. By building a customized Cascading p… ☆143 · Jul 7, 2022 · Updated 3 years ago
- Centralised repository for WARC usage specifications. ☆128 · Apr 4, 2026 · Updated 2 months ago
- An Apache Spark framework for easy data processing, extraction as well as derivation for web archives and archival collections, developed… ☆161 · Oct 8, 2025 · Updated 8 months ago
- Specification for a query language to request Verifiable Presentations from wallets etc. ☆10 · Apr 23, 2026 · Updated last month
- An Image Dictionary for Co-dfns ☆15 · Jun 16, 2017 · Updated 8 years ago
- code and data used to build a training dataset for dragnet models ☆10 · Nov 29, 2020 · Updated 5 years ago
- APL function editor written in APL ☆12 · Mar 9, 2026 · Updated 3 months ago
- IPLD Schema Implementation: parser and utilities ☆16 · Updated this week
- (Note: This repository is obsolete, please see the new Browsertrix webrecorder/browsertrix) Browser-Based On-Demand Web Archiving Automat… ☆38 · Apr 23, 2019 · Updated 7 years ago
- wpull fork with fixes and faster parsing using html5-parser; used by grab-site; should go away when wpull is similarly improved ☆31 · Sep 20, 2025 · Updated 8 months ago
- An IETF specification for cryptographic hyperlinking ☆15 · May 2, 2021 · Updated 5 years ago
- You've made the list, we'll help you check it twice. Given a domain-like string, verifies inclusion in a list you provide. ☆19 · Nov 13, 2020 · Updated 5 years ago
- CDXJ Indexing of WARC/ARCs ☆34 · May 11, 2026 · Updated 3 weeks ago
- MIMO platform for advanced communications and PNT applications ☆14 · Dec 8, 2014 · Updated 11 years ago
- Test whether W3C spec repos match a set of best practices ☆21 · Updated this week
- TypeDB (Core and Cloud) RPC Communication Protocol ☆18 · May 26, 2026 · Updated 2 weeks ago
- RESTful API documentation for De Lijn ☆10 · Jul 4, 2015 · Updated 10 years ago
- A service that provides archive-aware oEmbed-compatible embeddable surrogates (social cards, thumbnails, etc.) for archived web pages (me… ☆14 · Nov 15, 2021 · Updated 4 years ago
- Source code for domain classification (scholar or non-scholar) of a web query. ☆11 · May 31, 2016 · Updated 10 years ago
- A library for HTTPS Everywhere which compiles to WASM ☆16 · Feb 3, 2021 · Updated 5 years ago
- CI scripts for validating and processing metadata ☆11 · Dec 7, 2019 · Updated 6 years ago
- Proposed architecture for a Solid server ☆13 · Aug 21, 2020 · Updated 5 years ago
- Diceware random password generation using the ANU quantum random number server as the randomness source ☆17 · Oct 23, 2018 · Updated 7 years ago
- Builders for attrs ☆11 · Jul 31, 2019 · Updated 6 years ago
- Kaitai Struct YAML (KSY) schema specification ☆15 · Sep 12, 2025 · Updated 8 months ago
- utility to fetch provenance information from Internet Archive's Wayback Machine ☆15 · Feb 5, 2026 · Updated 4 months ago
- little scripts to introduce people to each other. ☆20 · Jun 25, 2016 · Updated 9 years ago
- A simple APL neural network. ☆11 · May 11, 2016 · Updated 10 years ago
- ANNSER is A Neural Network Simulator for Education and Research. ☆10 · Aug 28, 2016 · Updated 9 years ago