jedireza / warc
A Rust library for reading and writing WARC files
☆50 Updated 2 months ago
Alternatives and similar repositories for warc:
Users who are interested in warc are comparing it to the libraries listed below:
- Fast hierarchical agglomerative clustering in Rust. ☆94 Updated last year
- ☆45 Updated 2 years ago
- ☆63 Updated last year
- Spelling correction & Fuzzy search based on Symmetric Delete spelling correction algorithm. ☆132 Updated 3 weeks ago
- Fast item-to-item recommendations on the command line. ☆37 Updated 2 years ago
- Xor filters - efficient probabilistic hashsets. Faster and smaller than bloom and cuckoo filters. ☆135 Updated 10 months ago
- Rust implementation of JMESPath, a query language for JSON ☆137 Updated 4 months ago
- Rust helpers for conditional GET, HEAD, byte range serving, and gzip content encoding for static files and more with hyper and tokio. ☆34 Updated this week
- A Rust implementation of fractional indexing. ☆55 Updated 5 months ago
- A collection of small notes that aren't appropriate for my blog. ☆32 Updated 2 years ago
- Dynamic transformation of data using serde serializable, deserialize using JSON and a JSON transformation syntax similar to Javascript JS… ☆16 Updated 3 years ago
- Common stop words in a variety of languages ☆19 Updated last month
- Fast English word segmentation in Rust ☆96 Updated last month
- A vectorized JSON parser for pre-validated, minified documents ☆83 Updated 6 months ago
- finalfusion embeddings in Rust ☆95 Updated last year
- Lightweight FST-based autocompleter library written in Rust, targeting WebAssembly and data stored in-memory ☆32 Updated last year
- Rust library to find links such as URLs and email addresses in plain text, handling surrounding punctuation correctly ☆211 Updated 3 months ago
- Native Rust port of Google's HighwayHash, which makes use of SIMD instructions for a fast and strong hash function ☆160 Updated 3 weeks ago
- A lightweight full-text search library written in Rust that provides full control over the scoring calculations ☆66 Updated 7 months ago
- Context-sensitive word embeddings with subwords. In Rust. ☆87 Updated last year
- Multilingual implementation of RAKE algorithm for Rust ☆33 Updated this week
- [UNMAINTAINED] A transactional and deduplicating virtual file system ☆96 Updated 11 months ago
- A globbing library for Rust. ☆42 Updated last year
- 🗄️ A simple CLI for converting WARC to Parquet. ☆108 Updated last week
- Advisory cross-platform file locks using file descriptors ☆76 Updated last year
- Signed code reviews for Python packages. ☆23 Updated 2 years ago
- A dictionary of jargon and tropes around the community of Rust developers. ☆49 Updated 2 years ago
- Implementation of the Punkt sentence tokenizing algorithm in Rust. ☆35 Updated 5 years ago
- A downloader for different live stream providers ☆23 Updated 7 months ago
- Highly flexible library to manage and orchestrate JWT workflow ☆67 Updated 4 years ago