A Rust library for reading and writing WARC files
☆59 · Nov 27, 2024 · Updated last year
Alternatives and similar repositories for warc
Users interested in warc compare it to the libraries listed below.
- C++ library to parse WARC files · ☆11 · Jan 27, 2019 · Updated 7 years ago
- Use JSX as an Eleventy template language · ☆20 · May 29, 2023 · Updated 2 years ago
- Convert HTTP Archive (HAR) -> Web Archive (WARC) format · ☆55 · Oct 21, 2018 · Updated 7 years ago
- Convert Directories, Files and ZIP Files to Web Archives (WARC) · ☆97 · Apr 22, 2025 · Updated last year
- A polite and user-friendly downloader for Common Crawl data · ☆79 · Updated this week
- FreeBSD service daemon for KBFS, the Keybase filesystem · ☆13 · Jul 22, 2021 · Updated 4 years ago
- 🗄️ A simple CLI for converting WARC to Parquet. · ☆114 · Feb 12, 2025 · Updated last year
- Streaming WARC/ARC library for fast web archive IO · ☆455 · Apr 6, 2026 · Updated 3 weeks ago
- 🧩 Proposal to allow user scripts like "expand comments", "hide popups", "fill out this form", etc. to be reusable across pure browser en… · ☆19 · Jul 11, 2025 · Updated 9 months ago
- Wombat.js client-side rewriting library · ☆118 · Updated this week
- A dockerized, queued high fidelity web archiver based on Squidwarc · ☆62 · Jul 9, 2024 · Updated last year
- IFIscripts is an open-source digital preservation tool which facilitates collection management workflows within the IFI and further afiel… · ☆31 · Nov 19, 2025 · Updated 5 months ago
- An arithmetic coder for Rust. · ☆23 · May 24, 2023 · Updated 2 years ago
- A small library for building fast and highly customizable web crawlers · ☆16 · Jan 4, 2023 · Updated 3 years ago
- Einstein summation for Rust · ☆40 · Apr 8, 2021 · Updated 5 years ago
- Read and write WARC files in Go · ☆50 · Apr 13, 2026 · Updated 2 weeks ago
- Index Filesystem for FUSE · ☆17 · Dec 15, 2021 · Updated 4 years ago
- A Rust crate for helping write structs as binary data using ✨macro magic✨ · ☆18 · Apr 17, 2020 · Updated 6 years ago
- Command-line tool and Rust library for handling Web ARChive (WARC) files · ☆30 · Jun 2, 2025 · Updated 10 months ago
- A repository to organize materials from the AI4LAM Teach and Learning Working Group · ☆14 · May 5, 2023 · Updated 2 years ago
- Please note that the warc-indexer tool & code is now supported by NetArchiveSuite. The 'warc-indexer' directory and code that exists in t… · ☆132 · Nov 21, 2025 · Updated 5 months ago
- A typed Rust library for easily interacting with and consuming the Bluesky Jetstream service. · ☆51 · Apr 10, 2025 · Updated last year
- A prototype server to swarm multiple DATs for Webrecorder · ☆14 · Apr 27, 2019 · Updated 7 years ago
- WarcMiddleware lets users seamlessly download a mirror copy of a website when running a web crawl with the Python web crawler Scrapy. · ☆48 · Mar 19, 2018 · Updated 8 years ago
- A framework for creating digital exhibits by loading collection metadata directly from a CSV (such as a published Google Sheet!). See the… · ☆14 · Feb 20, 2026 · Updated 2 months ago
- Task-based Parallelism in Rust · ☆17 · Nov 17, 2021 · Updated 4 years ago
- utility to create an element from a simple CSS selector · ☆13 · Aug 1, 2023 · Updated 2 years ago
- A crate built on top of `axum-sessions`, implementing the CSRF Synchronizer Token Pattern · ☆15 · Updated this week
- Webrecorders DevTools Protocol Automation Library · ☆18 · Oct 18, 2022 · Updated 3 years ago
- Internet Archive's Sparkling Data Processing Library · ☆16 · Mar 3, 2026 · Updated last month
- Command line tools and libraries for handling and manipulating WARC files (and HTTP contents) · ☆173 · Aug 18, 2025 · Updated 8 months ago
- File Manager built with egui in Rust · ☆20 · Apr 14, 2026 · Updated 2 weeks ago
- A S3 hybrid storage interface for dat and hyperdrive · ☆13 · Jul 31, 2018 · Updated 7 years ago
- Chrome Debugging Protocol interface for python asyncio · ☆14 · Oct 31, 2020 · Updated 5 years ago
- Narwhal is a keyword and KEY NARRATIVE manager that creates language-aware classes. Because Narwhal does not use NLP it avoids complexity… · ☆12 · Oct 16, 2018 · Updated 7 years ago
- Incredibly hack proof of concept of automatic Rust -> Swig pipeline using procedural macros · ☆17 · Jul 18, 2018 · Updated 7 years ago
- WarcDB: Web crawl data as SQLite databases. · ☆404 · Jul 13, 2024 · Updated last year
- Build Amazon Simple Queue Service (SQS) based applications without the boilerplate · ☆10 · Apr 6, 2023 · Updated 3 years ago
- The Archives Unleashed Toolkit is an open-source toolkit for analyzing web archives. · ☆156 · Dec 5, 2025 · Updated 4 months ago