jedireza / warc
A Rust library for reading and writing WARC files
☆56 Updated 9 months ago
Alternatives and similar repositories for warc
Users who are interested in warc are comparing it to the libraries listed below.
- Spelling correction & Fuzzy search based on Symmetric Delete spelling correction algorithm. ☆139 Updated 2 months ago
- Fast English word segmentation in Rust ☆100 Updated 2 months ago
- A vectorized JSON parser for pre-validated, minified documents ☆83 Updated last year
- A WHATWG-compliant HTML5 tokenizer and tag soup parser ☆162 Updated 2 weeks ago
- Xor filters - efficient probabilistic hashsets. Faster and smaller than bloom and cuckoo filters. ☆145 Updated this week
- Fast item-to-item recommendations on the command line. ☆38 Updated 2 years ago
- ☆48 Updated 2 years ago
- The fastest and lightest mail parsing Rust library. ☆179 Updated 2 years ago
- [UNMAINTAINED] A transactional and deduplicating virtual file system ☆98 Updated last year
- Fast approximate nearest neighbor searching in Rust, based on HNSW index ☆330 Updated last month
- Rust implementation of JMESPath, a query language for JSON ☆140 Updated last month
- ☆67 Updated 2 years ago
- Fast hierarchical agglomerative clustering in Rust. ☆103 Updated 4 months ago
- A collection of small notes that aren't appropriate for my blog. ☆32 Updated 3 years ago
- Rust client for txtai ☆110 Updated this week
- 🃏 A distributed unique ID generator inspired by Twitter's Snowflake. ☆187 Updated 2 weeks ago
- Simple string matching with single- and multiple-wildcard operator ☆87 Updated 11 months ago
- A Rust library to generate URL, Index, Image, Video, and News sitemaps. ☆27 Updated this week
- Cargo subcommand for downloading crates directly from crates.io ☆29 Updated 3 years ago
- A crate implementing a synchronized map for memoization ☆31 Updated 7 months ago
- 🗄️ A simple CLI for converting WARC to Parquet. ☆112 Updated 6 months ago
- A high-performance, cross-platform file reverse utility ☆112 Updated last year
- A Rust implementation of fractional indexing. ☆58 Updated 11 months ago
- Rust library to find links such as URLs and email addresses in plain text, handling surrounding punctuation correctly ☆222 Updated 2 months ago
- Native Rust port of Google's HighwayHash, which makes use of SIMD instructions for a fast and strong hash function ☆170 Updated 3 weeks ago
- A pure Rust implementation of the Minisign signature tool. ☆100 Updated 2 months ago
- Lightweight FST-based autocompleter library written in Rust, targeting WebAssembly and data stored in-memory ☆32 Updated 2 years ago
- A convenient block copy program. ☆29 Updated 6 years ago
- Highly flexible library to manage and orchestrate JWT workflow ☆66 Updated 5 years ago
- Context-sensitive word embeddings with subwords. In Rust. ☆87 Updated last year