jedireza / warc
A Rust library for reading and writing WARC files
☆56Updated 10 months ago
Alternatives and similar repositories for warc
Users who are interested in warc are comparing it to the libraries listed below.
- Spelling correction & Fuzzy search based on Symmetric Delete spelling correction algorithm.☆139Updated 4 months ago
- Fast English word segmentation in Rust☆100Updated last month
- Fast hierarchical agglomerative clustering in Rust.☆102Updated 6 months ago
- ☆49Updated 3 years ago
- ☆67Updated 2 years ago
- Fast approximate nearest neighbor searching in Rust, based on HNSW index☆336Updated last month
- Port of arc90labs-readability to Rust☆131Updated last year
- finalfusion embeddings in Rust☆103Updated 2 years ago
- Rust implementation of JMESPath, a query language for JSON☆148Updated 3 months ago
- Rust client for txtai☆111Updated last month
- Xor filters - efficient probabilistic hashsets. Faster and smaller than bloom and cuckoo filters.☆144Updated last month
- Cross-platform embeddable sandboxing☆194Updated 2 months ago
- Rust port of sentence-transformers (https://github.com/UKPLab/sentence-transformers)☆121Updated last year
- JSON-LD implementation for Rust☆142Updated last month
- Rust library to find links such as URLs and email addresses in plain text, handling surrounding punctuation correctly☆222Updated 4 months ago
- A vectorized JSON parser for pre-validated, minified documents☆85Updated last year
- Native Rust port of Google's HighwayHash, which makes use of SIMD instructions for a fast and strong hash function☆172Updated 2 months ago
- [UNMAINTAINED] A transactional and deduplicating virtual file system☆98Updated last year
- 🗄️ A simple CLI for converting WARC to Parquet.☆113Updated 8 months ago
- The fastest and lightest mail parsing Rust library.☆178Updated 2 years ago
- Dachshund is a graph mining library written in Rust. It provides high performance data structures for multiple kinds of graphs, from simp…☆91Updated last year
- A collection of small notes that aren't appropriate for my blog.☆32Updated 3 years ago
- A WHATWG-compliant HTML5 tokenizer and tag soup parser☆164Updated 2 weeks ago
- Block-level copy-on-write tool☆17Updated 10 months ago
- A Rust implementation of fractional indexing.☆59Updated last year
- Simple NLP in Rust with Python bindings☆152Updated 2 years ago
- Locality Sensitive Hashing in Rust with Python bindings☆119Updated 2 years ago
- Context-sensitive word embeddings with subwords. In Rust.☆88Updated last year
- Hidden Markov Models in Rust☆78Updated last year
- A task orchestrator using Redis, written in Rust☆71Updated 2 years ago