jedireza / warc
A Rust library for reading and writing WARC files
☆56Updated 11 months ago
Alternatives and similar repositories for warc
Users that are interested in warc are comparing it to the libraries listed below
- Fast English word segmentation in Rust☆101Updated last month
- Spelling correction & Fuzzy search based on Symmetric Delete spelling correction algorithm.☆140Updated 5 months ago
- Fast approximate nearest neighbor searching in Rust, based on HNSW index☆337Updated 3 weeks ago
- ☆67Updated 2 years ago
- Fast hierarchical agglomerative clustering in Rust.☆102Updated 7 months ago
- A vectorized JSON parser for pre-validated, minified documents☆85Updated last year
- Xor filters - efficient probabilistic hashsets. Faster and smaller than bloom and cuckoo filters.☆147Updated 2 months ago
- A command line tool to rename media files based on titles from IMDb.☆238Updated last year
- Rust client for txtai☆112Updated 2 weeks ago
- ☆51Updated 3 years ago
- The fastest and lightest mail parsing Rust library.☆179Updated 2 years ago
- Rust implementation of JMESPath, a query language for JSON☆148Updated 4 months ago
- A high-performance, cross-platform file reverse utility☆112Updated last year
- A WHATWG-compliant HTML5 tokenizer and tag soup parser☆164Updated 2 weeks ago
- Port of arc90labs-readability with rust☆132Updated last year
- [UNMAINTAINED] A transactional and deduplicating virtual file system☆97Updated last year
- A collection of small notes that aren't appropriate for my blog.☆32Updated 3 years ago
- 🗄️ A simple CLI for converting WARC to Parquet.☆113Updated 9 months ago
- A crate implementing a synchronized map for memoization☆30Updated 10 months ago
- Coppers is a custom test harness for Rust that measures the energy usage of your test suite.☆185Updated 3 years ago
- A Rust implementation of fractional indexing.☆61Updated last year
- Native Rust port of Google's HighwayHash, which makes use of SIMD instructions for a fast and strong hash function☆172Updated 3 months ago
- Dachshund is a graph mining library written in Rust. It provides high performance data structures for multiple kinds of graphs, from simp…☆91Updated 2 years ago
- Rust library to find links such as URLs and email addresses in plain text, handling surrounding punctuation correctly☆225Updated 5 months ago
- Cross-platform embeddable sandboxing☆198Updated 3 months ago
- Rust implementation of Simhash☆24Updated 2 years ago
- Cargo subcommand for downloading crates directly from crates.io☆28Updated 4 years ago
- 🃏 A distributed unique ID generator inspired by Twitter's Snowflake.☆190Updated last month
- Rust port of sentence-transformers (https://github.com/UKPLab/sentence-transformers)☆122Updated last year
- finalfusion embeddings in Rust☆104Updated 2 years ago