zipnn / zipnn
A Lossless Compression Library for AI pipelines
☆290 · Updated 6 months ago
Alternatives and similar repositories for zipnn
Users interested in zipnn are comparing it to the libraries listed below.
- ☆275 · Updated last week
- An efficient implementation of the method proposed in "The Era of 1-bit LLMs" ☆155 · Updated last year
- Google TPU optimizations for transformer models ☆132 · Updated 3 weeks ago
- A high-throughput and memory-efficient inference and serving engine for LLMs ☆267 · Updated last month
- 👷 Build compute kernels ☆201 · Updated this week
- A safetensors extension to efficiently store sparse quantized tensors on disk ☆233 · Updated this week
- Module, Model, and Tensor Serialization/Deserialization ☆283 · Updated 4 months ago
- Q-GaLore: Quantized GaLore with INT4 Projection and Layer-Adaptive Low-Rank Gradients ☆201 · Updated last year
- Simple high-throughput inference library ☆155 · Updated 8 months ago
- 🕹️ Performance Comparison of MLOps Engines, Frameworks, and Languages on Mainstream AI Models ☆139 · Updated last year
- Tree Attention: Topology-aware Decoding for Long-Context Attention on GPU clusters ☆131 · Updated last year
- Scalable and Performant Data Loading ☆360 · Updated last week
- Accelerating your LLM training to full speed! Made with ❤️ by ServiceNow Research ☆276 · Updated this week
- PyTorch implementation of models from the Zamba2 series ☆185 · Updated 11 months ago
- ☆47 · Updated last year
- Fault tolerance for PyTorch (HSDP, LocalSGD, DiLoCo, Streaming DiLoCo) ☆467 · Updated 2 weeks ago
- Manage ML configuration with pydantic ☆16 · Updated 3 weeks ago
- ☆459 · Updated last month
- Lightweight toolkit package to train and fine-tune 1.58-bit language models ☆106 · Updated 7 months ago
- An implementation of the PSGD Kron second-order optimizer for PyTorch ☆97 · Updated 5 months ago
- Load compute kernels from the Hub ☆359 · Updated this week
- 🦄 Unitxt is a Python library for enterprise-grade evaluation of AI performance, offering the world's largest catalog of tools and data … ☆212 · Updated last week
- REAP: Router-weighted Expert Activation Pruning for SMoE compression ☆189 · Updated last month
- ☆27 · Updated this week
- ☆235 · Updated last week
- Inference server benchmarking tool ☆136 · Updated 3 months ago
- LM engine, a library for pretraining/finetuning LLMs ☆108 · Updated this week
- Train, tune, and infer the Bamba model ☆137 · Updated 7 months ago
- A unified library for building, evaluating, and storing speculative decoding algorithms for LLM inference in vLLM ☆190 · Updated this week
- ☆138 · Updated 4 months ago