A Rust library for reading and writing WARC files
☆59 · Nov 27, 2024 · Updated last year
Alternatives and similar repositories for warc
Users that are interested in warc are comparing it to the libraries listed below
- CDXJ Indexing of WARC/ARCs ☆33 · Dec 10, 2024 · Updated last year
- Texting Robots: A Rust native `robots.txt` parser with thorough unit testing ☆29 · Feb 14, 2024 · Updated 2 years ago
- Tools to Work with the Web Archive Ecosystem in R ☆21 · Aug 20, 2017 · Updated 8 years ago
- Parse WARC (Web Archive Files) as a node.js stream ☆23 · Oct 20, 2014 · Updated 11 years ago
- ☆14 · Mar 20, 2019 · Updated 7 years ago
- Centralised repository for WARC usage specifications. ☆125 · Oct 12, 2025 · Updated 5 months ago
- Convert HTTP Archive (HAR) -> Web Archive (WARC) format ☆56 · Oct 21, 2018 · Updated 7 years ago
- A polite and user-friendly downloader for Common Crawl data ☆71 · Mar 3, 2026 · Updated 2 weeks ago
- FreeBSD service daemon for KBFS, the Keybase filesystem ☆13 · Jul 22, 2021 · Updated 4 years ago
- 🗄️ A simple CLI for converting WARC to Parquet. ☆113 · Feb 12, 2025 · Updated last year
- Scoop by Rusty Foster and the CMF running Kuro5hin and other websites ☆12 · Apr 14, 2017 · Updated 8 years ago
- Streaming WARC/ARC library for fast web archive IO ☆451 · Dec 10, 2024 · Updated last year
- Converts HTTrack crawls to WARC files ☆34 · Aug 6, 2024 · Updated last year
- Supporting example for "A Rust SentencePiece implementation" ☆20 · Jun 7, 2020 · Updated 5 years ago
- Experimental proxy and wrapper for safely embedding Web Archives (warc, warc.gz, wacz) into web pages. ☆42 · Nov 24, 2025 · Updated 3 months ago
- Web archive index server based on RocksDB ☆38 · Mar 2, 2026 · Updated 2 weeks ago
- Einstein summation for Rust ☆40 · Apr 8, 2021 · Updated 4 years ago
- Tests for the Nintendo 3DS's ARM9 security processor ☆24 · Dec 25, 2019 · Updated 6 years ago
- Command-line tool and Rust library for handling Web ARChive (WARC) files ☆28 · Jun 2, 2025 · Updated 9 months ago
- A repository to organize materials from the AI4LAM Teach and Learning Working Group ☆14 · May 5, 2023 · Updated 2 years ago
- A native Rust port of Google's robots.txt parser and matcher C++ library. ☆100 · Feb 13, 2021 · Updated 5 years ago
- Serve hypercore-archiver over discovery-swarm ☆14 · Apr 19, 2017 · Updated 8 years ago
- ☆17 · Apr 19, 2025 · Updated 11 months ago
- awaits the completion of multiple async tasks ☆12 · Nov 29, 2015 · Updated 10 years ago
- Task-based Parallelism in Rust ☆17 · Nov 17, 2021 · Updated 4 years ago
- Pure Elixir disk backed key-value store. ☆29 · Jan 28, 2026 · Updated last month
- utility to create an element from a simple CSS selector ☆13 · Aug 1, 2023 · Updated 2 years ago
- A crate built on top of `axum-sessions`, implementing the CSRF Synchronizer Token Pattern ☆15 · Mar 12, 2026 · Updated last week
- Diff two unist trees ☆14 · Aug 21, 2020 · Updated 5 years ago
- Command line tools and libraries for handling and manipulating WARC files (and HTTP contents) ☆171 · Aug 18, 2025 · Updated 7 months ago
- Middleware and Browserify transform for less files ☆12 · Dec 3, 2025 · Updated 3 months ago
- A regular expression library implemented in Rust. ☆37 · Nov 19, 2015 · Updated 10 years ago
- A S3 hybrid storage interface for dat and hyperdrive ☆13 · Jul 31, 2018 · Updated 7 years ago
- Clone of https://git.kernel.org/pub/scm/linux/kernel/git/jejb/sbsigntools.git/ with patches for yubikey support ☆10 · Aug 14, 2020 · Updated 5 years ago
- Tool and library for handling Web ARChive (WARC) files. ☆165 · Oct 11, 2024 · Updated last year
- game in rust-lang ☆12 · Feb 19, 2016 · Updated 10 years ago
- Incredibly hack proof of concept of automatic Rust -> Swig pipeline using procedural macros ☆16 · Jul 18, 2018 · Updated 7 years ago
- ☆14 · Mar 2, 2026 · Updated 2 weeks ago
- pass pages through a pluggable pipeline to extract information from them. ☆14 · Apr 21, 2015 · Updated 10 years ago