jedireza / warc
A Rust library for reading and writing WARC files
☆57 Updated last year
Alternatives and similar repositories for warc
Users that are interested in warc are comparing it to the libraries listed below
- Fast English word segmentation in Rust ☆101 Updated 2 weeks ago
- Spelling correction & Fuzzy search based on Symmetric Delete spelling correction algorithm. ☆140 Updated 6 months ago
- ☆51 Updated 3 years ago
- ☆68 Updated 2 years ago
- Fast hierarchical agglomerative clustering in Rust. ☆104 Updated 9 months ago
- [UNMAINTAINED] A transactional and deduplicating virtual file system ☆97 Updated last year
- Fast approximate nearest neighbor searching in Rust, based on HNSW index ☆338 Updated 2 weeks ago
- Rust client for txtai ☆113 Updated 2 weeks ago
- 🗄️ A simple CLI for converting WARC to Parquet. ☆113 Updated 10 months ago
- A vectorized JSON parser for pre-validated, minified documents ☆85 Updated last year
- Xor filters - efficient probabilistic hashsets. Faster and smaller than bloom and cuckoo filters. ☆150 Updated 3 weeks ago
- A WHATWG-compliant HTML5 tokenizer and tag soup parser ☆166 Updated last month
- Small-scale process orchestration ☆65 Updated 3 years ago
- Harmonious distributed data analysis in Rust. ☆21 Updated 4 years ago
- An in-memory filesystem cache layer based on tokio::fs, with least-frequently-used eviction ☆38 Updated 10 months ago
- A distributed private application framework ☆61 Updated 10 months ago
- A Rust implementation of fractional indexing. ☆65 Updated last year
- Data visualisation library, written in Rust ☆29 Updated 5 years ago
- Rust port of sentence-transformers (https://github.com/UKPLab/sentence-transformers) ☆124 Updated last year
- Fast item-to-item recommendations on the command line. ☆38 Updated 3 years ago
- Native Rust port of Google's HighwayHash, which makes use of SIMD instructions for a fast and strong hash function ☆174 Updated 4 months ago
- A collection of small notes that aren't appropriate for my blog. ☆32 Updated 3 years ago
- Simple NLP in Rust with Python bindings ☆153 Updated 2 years ago
- finalfusion embeddings in Rust ☆105 Updated 2 years ago
- Simple string matching with single- and multiple-wildcard operator ☆94 Updated last month
- Rust wrapper for the BlingFire tokenization library ☆16 Updated 5 years ago
- A crate implementing a synchronized map for memoization ☆30 Updated 11 months ago
- Rust library to find links such as URLs and email addresses in plain text, handling surrounding punctuation correctly ☆226 Updated last month
- Proxy for turning web browsers into web servers. Load a 100GB file in your browser and stream it over the public web with HTTP byte range… ☆103 Updated 2 years ago
- Locality Sensitive Hashing in Rust with Python bindings ☆120 Updated 2 years ago