originell / smaz-py3
Small string compression using the smaz compression algorithm. Fast, because it's written in C. Supports Python 3+.
☆13 · Updated 3 weeks ago
Alternatives and similar repositories for smaz-py3
Users who are interested in smaz-py3 are comparing it to the libraries listed below.
- Benchmark scripts for comparing different tokenizers and sentence segmenters of German ☆12 · Updated 2 years ago
- Loadable spellfix1 extension for sqlite as python package ☆26 · Updated last year
- Open Jupyter notebooks from GitHub repositories or URLs directly in Jupyter. ☆32 · Updated 6 months ago
- Efficiently computing & storing token n-grams from large corpora ☆26 · Updated 10 months ago
- ☆41 · Updated 3 months ago
- Python bindings for simdjson using libpy ☆67 · Updated 2 years ago
- NLP with Rust for Python 🦀🐍 ☆64 · Updated 2 months ago
- Fast Text Classification with Compressors dictionary ☆150 · Updated last year
- Libzim binding for Python: read/write ZIM files in Python ☆92 · Updated 3 months ago
- ☆18 · Updated last year
- Sqlite3-based logging for Python ☆14 · Updated last year
- A file utility for accessing both local and remote files through a unified interface. ☆43 · Updated 2 months ago
- A polite and user-friendly downloader for Common Crawl data ☆51 · Updated last month
- A client library for LAION's effort to filter CommonCrawl with CLIP, building a large scale image-text dataset. ☆32 · Updated 2 years ago
- A library for incremental loading of large PyTorch checkpoints ☆56 · Updated 2 years ago
- WASM-powered sandbox implementation of exec() for safely running dynamic Python code ☆36 · Updated last year
- ☆20 · Updated 2 months ago
- ☆76 · Updated 7 months ago
- Fast and vectorizable algorithms for searching in a vector of sorted floating point numbers ☆145 · Updated 7 months ago
- Python module for detection of CPU features ☆29 · Updated 5 months ago
- LLM plugin for clustering embeddings ☆80 · Updated last year
- Python JSON benchmarking and "correctness". ☆34 · Updated last year
- Efficient BM25 with DuckDB 🦆 ☆54 · Updated 7 months ago
- Python bindings for RocksDB ☆35 · Updated 3 years ago
- Tree-based indexes for neural-search ☆32 · Updated last year
- Simple implementation of a GPT (training and inference) in PyTorch. ☆12 · Updated last year
- ☆59 · Updated 3 weeks ago
- Pre-train Static Word Embeddings ☆85 · Updated 2 months ago
- image perceptual hash based on ML ☆28 · Updated 3 years ago
- Library for fast text representation and classification. ☆31 · Updated last year