jedireza / warc
A Rust library for reading and writing WARC files
☆56 Updated 8 months ago
Alternatives and similar repositories for warc
Users that are interested in warc are comparing it to the libraries listed below
- Spelling correction & Fuzzy search based on Symmetric Delete spelling correction algorithm. ☆139 Updated last month
- ☆48 Updated 2 years ago
- Fast English word segmentation in Rust ☆99 Updated last month
- Rust implementation of JMESPath, a query language for JSON ☆140 Updated last month
- Xor filters - efficient probabilistic hashsets. Faster and smaller than bloom and cuckoo filters. ☆144 Updated last year
- A vectorized JSON parser for pre-validated, minified documents ☆83 Updated last year
- ☆66 Updated 2 years ago
- Fast approximate nearest neighbor searching in Rust, based on HNSW index ☆330 Updated last month
- Fast hierarchical agglomerative clustering in Rust. ☆102 Updated 4 months ago
- A Rust implementation of fractional indexing. ☆57 Updated 10 months ago
- A WHATWG-compliant HTML5 tokenizer and tag soup parser ☆161 Updated 5 months ago
- Native Rust port of Google's HighwayHash, which makes use of SIMD instructions for a fast and strong hash function ☆166 Updated 2 weeks ago
- Fast item-to-item recommendations on the command line. ☆38 Updated 2 years ago
- [UNMAINTAINED] A transactional and deduplicating virtual file system ☆98 Updated last year
- Straightforward password, passphrase, TOTP, and HOTP user authentication ☆58 Updated 5 years ago
- The fastest and lightest mail parsing Rust library. ☆177 Updated 2 years ago
- Simple string matching with single- and multiple-wildcard operator ☆87 Updated 10 months ago
- Media file metadata for human consumption ☆52 Updated 8 months ago
- Common stop words in a variety of languages ☆22 Updated 5 months ago
- Rust library to find links such as URLs and email addresses in plain text, handling surrounding punctuation correctly ☆222 Updated 2 months ago
- Web analytics focusing on privacy and simplicity. ☆96 Updated last year
- 🗄️ A simple CLI for converting WARC to Parquet. ☆112 Updated 5 months ago
- Port of arc90labs-readability in Rust ☆129 Updated last year
- Dachshund is a graph mining library written in Rust. It provides high performance data structures for multiple kinds of graphs, from simp… ☆90 Updated last year
- Rust wrapper for the BlingFire tokenization library ☆15 Updated 5 years ago
- A downloader for different live stream providers ☆24 Updated last year
- Dynamic transformation of data using serde serializable, deserialize using JSON and a JSON transformation syntax similar to JavaScript JS… ☆16 Updated 3 years ago
- A command line tool to rename media files based on titles from IMDb. ☆236 Updated 10 months ago
- A task orchestrator using Redis, written in Rust ☆71 Updated 2 years ago
- Rust client for txtai ☆111 Updated 2 months ago