jedireza / warc
A Rust library for reading and writing WARC files
☆55 Updated 7 months ago
Alternatives and similar repositories for warc
Users who are interested in warc are comparing it to the libraries listed below.
- Spelling correction & Fuzzy search based on Symmetric Delete spelling correction algorithm. ☆137 Updated last month
- Fast English word segmentation in Rust ☆99 Updated 3 weeks ago
- ☆66 Updated 2 years ago
- Common stop words in a variety of languages ☆21 Updated 4 months ago
- ☆47 Updated 2 years ago
- Fast hierarchical agglomerative clustering in Rust. ☆100 Updated 3 months ago
- A vectorized JSON parser for pre-validated, minified documents ☆83 Updated 11 months ago
- Xor filters - efficient probabilistic hashsets. Faster and smaller than bloom and cuckoo filters. ☆140 Updated last year
- Rust client for txtai ☆111 Updated last month
- Fast item-to-item recommendations on the command line. ☆38 Updated 2 years ago
- Fast approximate nearest neighbor searching in Rust, based on HNSW index ☆330 Updated last week
- finalfusion embeddings in Rust ☆102 Updated last year
- Media file metadata for human consumption ☆52 Updated 7 months ago
- [UNMAINTAINED] A transactional and deduplicating virtual file system ☆97 Updated last year
- Rust implementation of JMESPath, a query language for JSON ☆139 Updated last week
- A downloader for different live stream providers ☆24 Updated last year
- Rust port of sentence-transformers (https://github.com/UKPLab/sentence-transformers) ☆117 Updated 10 months ago
- Nuts is a Rust library that offers a simple publish-subscribe API, featuring decoupled creation of the publisher and the subscriber. ☆68 Updated 4 years ago
- Port of arc90labs-readability with rust ☆129 Updated last year
- 🗄️ A simple CLI for converting WARC to Parquet. ☆111 Updated 5 months ago
- Straightforward password, passphrase, TOTP, and HOTP user authentication ☆58 Updated 5 years ago
- Dachshund is a graph mining library written in Rust. It provides high performance data structures for multiple kinds of graphs, from simp… ☆90 Updated last year
- Dynamic transformation of data using serde serializable, deserialize using JSON and a JSON transformation syntax similar to Javascript JS… ☆16 Updated 3 years ago
- A free file hosting server that focuses on speed, reliability and security. ☆102 Updated last year
- Rust implementation of Duckling ☆78 Updated 3 years ago
- A Rust library to generate URL, Index, Image, Video, and News sitemaps. ☆25 Updated 4 months ago
- A Rust implementation of fractional indexing. ☆57 Updated 10 months ago
- Web analytics focusing on privacy and simplicity. ☆96 Updated last year
- The fastest and lightest mail parsing Rust library. ☆177 Updated 2 years ago
- A WHATWG-compliant HTML5 tokenizer and tag soup parser ☆161 Updated 4 months ago