zipnn / zipnn
A Lossless Compression Library for AI pipelines
☆250 · Updated 2 months ago
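For orientation before the comparison list, here is a minimal usage sketch of zipnn's Python API. It is a sketch under assumptions: the `ZipNN` class, its `method="zstd"` backend option, and the `compress`/`decompress` methods are taken from the project's documentation and may differ in current releases, so check the repository for the actual interface.

```python
# Minimal sketch: lossless byte-level round trip with zipnn.
# Assumption: the package exposes a ZipNN class with compress()/decompress()
# methods, as described in the project's README; verify before relying on it.
from zipnn import ZipNN

zpn = ZipNN(method="zstd")  # assumed: selects a zstd-based backend

original = b"example payload, e.g. serialized model weights"
compressed = zpn.compress(original)
restored = zpn.decompress(compressed)

assert restored == original  # lossless: the round trip recovers the input
```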
Alternatives and similar repositories for zipnn
Users interested in zipnn are comparing it to the libraries listed below
- ☆222 · Updated this week
- 🦄 Unitxt is a Python library for enterprise-grade evaluation of AI performance, offering the world's largest catalog of tools and data … ☆200 · Updated this week
- ☆68 · Updated this week
- Inference server benchmarking tool ☆74 · Updated 2 months ago
- A safetensors extension to efficiently store sparse quantized tensors on disk ☆129 · Updated this week
- Google TPU optimizations for transformers models ☆113 · Updated 5 months ago
- 🚀 Collection of tuning recipes with HuggingFace SFTTrainer and PyTorch FSDP. ☆47 · Updated this week
- Evaluate and Enhance Your LLM Deployments for Real-World Inference Needs ☆358 · Updated this week
- Manage ML configuration with pydantic ☆16 · Updated last month
- Simple high-throughput inference library ☆119 · Updated last month
- Lightweight toolkit package to train and fine-tune 1.58-bit language models ☆80 · Updated last month
- DeMo: Decoupled Momentum Optimization ☆188 · Updated 6 months ago
- ArcticTraining is a framework designed to simplify and accelerate the post-training process for large language models (LLMs) ☆130 · Updated this week
- ☆137 · Updated this week
- PyTorch Single Controller ☆231 · Updated this week
- ☆93 · Updated last month
- QuIP quantization ☆54 · Updated last year
- An efficient implementation of the method proposed in "The Era of 1-bit LLMs" ☆154 · Updated 8 months ago
- Scalable and Performant Data Loading ☆278 · Updated this week
- InferX is an Inference Function-as-a-Service platform ☆111 · Updated last week
- Training-free Post-training Efficient Sub-quadratic Complexity Attention. Implemented with OpenAI Triton. ☆137 · Updated this week
- PyTorch per-step fault tolerance (actively under development) ☆329 · Updated last week
- A high-throughput and memory-efficient inference and serving engine for LLMs ☆264 · Updated 8 months ago
- Top papers related to LLM-based agent evaluation ☆70 · Updated 2 weeks ago
- ☆124 · Updated 2 months ago
- A tiny LLM Agent with minimal dependencies, focused on local inference. ☆53 · Updated 8 months ago
- ☆182 · Updated 2 months ago
- A repository to unravel the language of GPUs, making their kernel conversations easy to understand ☆185 · Updated 3 weeks ago
- Cray-LM unified training and inference stack. ☆22 · Updated 4 months ago
- ☆53 · Updated last year