originell / smaz-py3
Small string compression using the smaz compression algorithm. Fast, because it's written in C. Supports Python 3+.
☆13Updated 2 months ago
Alternatives and similar repositories for smaz-py3
Users who are interested in smaz-py3 are comparing it to the libraries listed below.
- Python JSON benchmarking and "correctness".☆36Updated 2 years ago
- Loadable spellfix1 extension for sqlite as python package☆26Updated last year
- A high-performance library for compressed ndarrays, with a flexible computational engine☆183Updated this week
- ☆23Updated 6 months ago
- Libzim binding for Python: read/write ZIM files in Python☆96Updated last week
- An excellent developer tool for excellent developers☆13Updated last month
- WASM-powered sandbox implementation of exec() for safely running dynamic Python code☆36Updated last year
- ☆18Updated last year
- Jupyter Textual-based Widget☆15Updated last year
- NoPdb: Non-interactive Python Debugger☆84Updated 3 years ago
- Pretraining data reconstruction scripts for Apertus☆109Updated last month
- Python bindings for simdjson using libpy☆69Updated 2 years ago
- 🔤 Measure edit distance based on keyboard layout☆63Updated 2 months ago
- NLP with Rust for Python 🦀🐍☆70Updated 7 months ago
- Grammars suitable for lark parser and Hypothesis☆53Updated last year
- Tooling for exact and MinHash deduplication of large-scale text datasets☆44Updated this week
- An easy Python framework to build distributed systems☆50Updated last year
- Happy Eyeballs for pre-resolved hosts☆36Updated this week
- Tree-based indexes for neural-search☆31Updated last year
- Python bindings for RocksDB☆35Updated 3 years ago
- Efficient BM25 with DuckDB 🦆☆59Updated last year
- A polite and user-friendly downloader for Common Crawl data☆63Updated 4 months ago
- Simple implementation of a GPT (training and inference) in PyTorch.☆13Updated 2 years ago
- Fast and vectorizable algorithms for searching in a vector of sorted floating point numbers☆153Updated last year
- A file utility for accessing both local and remote files through a unified interface.☆44Updated this week
- Benchmark scripts for comparing different tokenizers and sentence segmenters of German☆12Updated 2 years ago
- Extracts structured data from unstructured input. Programming language agnostic. Uses llama.cpp☆44Updated last year
- Checks regexes for overlaps. Based on greenery by @qntm.☆56Updated last year
- image perceptual hash based on ML☆28Updated 3 years ago
- ☆46Updated last month