zipnn / zipnnLinks

A Lossless Compression Library for AI pipelines

☆274

Alternatives and similar repositories for zipnn

Users that are interested in zipnn are comparing it to the libraries listed below

Sorting:

run-ai / runai-model-streamer
☆232Updated this week
huggingface / optimum-tpu
Google TPU optimizations for transformers models
☆118Updated 6 months ago
astramind-ai / BitMat
An efficent implementation of the method proposed in "The Era of 1-bit LLMs"
☆154Updated 9 months ago
facebookresearch / fastgen
Simple high-throughput inference library
☆125Updated 2 months ago
neuralmagic / nm-vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
☆266Updated 10 months ago
DeepAuto-AI / hip-attention
Training-free Post-training Efficient Sub-quadratic Complexity Attention. Implemented with OpenAI Triton.
☆142Updated this week
huggingface / kernels
Load compute kernels from the Hub
☆233Updated this week
VITA-Group / Q-GaLore
Q-GaLore: Quantized GaLore with INT4 Projection and Layer-Adaptive Low-Rank Gradients.
☆198Updated last year
vllm-project / guidellm
Evaluate and Enhance Your LLM Deployments for Real-World Inference Needs
☆479Updated this week
ScalingIntelligence / tokasaurus
☆392Updated this week
pytorch / torchft
Fault tolerance for PyTorch (HSDP, LocalSGD, DiLoCo, Streaming DiLoCo)
☆377Updated this week
evanatyourservice / kron_torch
An implementation of PSGD Kron second-order optimizer for PyTorch
☆94Updated 2 weeks ago
ServiceNow / Fast-LLM
Accelerating your LLM training to full speed! Made with ❤️ by ServiceNow Research
☆218Updated this week
bloc97 / DeMo
DeMo: Decoupled Momentum Optimization
☆190Updated 8 months ago
tiiuae / onebitllms
Lightweight toolkit package to train and fine-tune 1.58bit Language models
☆82Updated 2 months ago
facebookresearch / spdl
Scalable and Performant Data Loading
☆291Updated this week
huggingface / kernel-builder
👷 Build compute kernels
☆93Updated this week
QuixiAI / grokadamw
☆134Updated 11 months ago
huggingface / gpu-fryer
Where GPUs get cooked 👩‍🍳🔥
☆274Updated this week
IST-DASLab / Quartet
☆75Updated last month
pytorch-labs / monarch
PyTorch Single Controller
☆345Updated this week
skypilot-org / skypilot-catalog
☆25Updated this week
google-deepmind / asyncdiloco
☆45Updated last year
Dan-wanna-M / formatron
Formatron empowers everyone to control the format of language models' output with minimal overhead.
☆221Updated 2 months ago
jlscheerer / xtr-warp
XTR/WARP (SIGIR'25) is an extremely fast and accurate retrieval engine based on Stanford's ColBERTv2/PLAID and Google DeepMind's XTR.
☆152Updated 3 months ago
QuixiAI / spectrum
☆129Updated 4 months ago
Zyphra / Zamba2
PyTorch implementation of models from the Zamba2 series.
☆184Updated 6 months ago
premAI-io / benchmarks
🕹️ Performance Comparison of MLOps Engines, Frameworks, and Languages on Mainstream AI Models.
☆137Updated last year
AnswerDotAI / fastkmeans
☆64Updated last month
NVIDIA-NeMo / Run
A tool to configure, launch and manage your machine learning experiments.
☆176Updated this week