zipnn / zipnn
A Lossless Compression Library for AI pipelines
☆275 Updated last month
Alternatives and similar repositories for zipnn
Users who are interested in zipnn are comparing it to the libraries listed below.
- Google TPU optimizations for transformers models ☆120 Updated 7 months ago
- An efficient implementation of the method proposed in "The Era of 1-bit LLMs" ☆155 Updated 10 months ago
- ☆238 Updated this week
- Simple high-throughput inference library ☆127 Updated 3 months ago
- Load compute kernels from the Hub ☆258 Updated this week
- Q-GaLore: Quantized GaLore with INT4 Projection and Layer-Adaptive Low-Rank Gradients. ☆200 Updated last year
- A high-throughput and memory-efficient inference and serving engine for LLMs ☆267 Updated 10 months ago
- Training-free Post-training Efficient Sub-quadratic Complexity Attention. Implemented with OpenAI Triton. ☆145 Updated this week
- Tree Attention: Topology-aware Decoding for Long-Context Attention on GPU clusters ☆129 Updated 8 months ago
- ☆85 Updated last week
- ☆102 Updated 3 weeks ago
- 👷 Build compute kernels ☆119 Updated this week
- ☆407 Updated last week
- Lightweight toolkit package to train and fine-tune 1.58bit Language models ☆85 Updated 3 months ago
- A safetensors extension to efficiently store sparse quantized tensors on disk ☆153 Updated this week
- Repo for "LoLCATs: On Low-Rank Linearizing of Large Language Models" ☆245 Updated 7 months ago
- Fault tolerance for PyTorch (HSDP, LocalSGD, DiLoCo, Streaming DiLoCo) ☆389 Updated 2 weeks ago
- Code for "LayerSkip: Enabling Early Exit Inference and Self-Speculative Decoding", ACL 2024 ☆331 Updated 3 months ago
- Scalable and Performant Data Loading ☆295 Updated this week
- ☆134 Updated last week
- RWKV-7: Surpassing GPT ☆94 Updated 9 months ago
- PyTorch implementation of models from the Zamba2 series. ☆184 Updated 7 months ago
- 🚀 Collection of tuning recipes with HuggingFace SFTTrainer and PyTorch FSDP. ☆47 Updated this week
- Utils for Unsloth https://github.com/unslothai/unsloth ☆134 Updated this week
- DeMo: Decoupled Momentum Optimization ☆190 Updated 9 months ago
- 🦄 Unitxt is a Python library for enterprise-grade evaluation of AI performance, offering the world's largest catalog of tools and data … ☆206 Updated last week
- Storing long contexts in tiny caches with self-study ☆145 Updated last week
- ☆68 Updated last month
- Train, tune, and infer Bamba model ☆131 Updated 2 months ago
- Accelerating your LLM training to full speed! Made with ❤️ by ServiceNow Research ☆223 Updated this week