zipnn / zipnn
A Lossless Compression Library for AI pipelines
☆248Updated last month
Alternatives and similar repositories for zipnn
Users who are interested in zipnn are comparing it to the libraries listed below.
- Top papers related to LLM-based agent evaluation☆68Updated 2 weeks ago
- ☆215Updated this week
- 🚀 Collection of tuning recipes with HuggingFace SFTTrainer and PyTorch FSDP.☆44Updated this week
- 🦄 Unitxt is a Python library for enterprise-grade evaluation of AI performance, offering the world's largest catalog of tools and data …☆196Updated this week
- Evaluate and Enhance Your LLM Deployments for Real-World Inference Needs☆317Updated this week
- An efficient implementation of the method proposed in "The Era of 1-bit LLMs"☆154Updated 7 months ago
- Inference server benchmarking tool☆68Updated last month
- Official implementation of "Dataset Size Recovery from LoRA Weights" paper.☆33Updated 11 months ago
- ☆37Updated last month
- Google TPU optimizations for transformers models☆112Updated 4 months ago
- DeMo: Decoupled Momentum Optimization☆188Updated 6 months ago
- The Batched API provides a flexible and efficient way to process multiple requests in a batch, with a primary focus on dynamic batching o…☆136Updated 2 weeks ago
- IBM development fork of https://github.com/huggingface/text-generation-inference☆60Updated 3 weeks ago
- Simple high-throughput inference library☆115Updated 3 weeks ago
- ☆180Updated last month
- ☆130Updated 2 months ago
- Cray-LM unified training and inference stack.☆22Updated 4 months ago
- TokenSHAP: Explain individual token importance in large language model prompts with SHAP values. Gain insights, debug models, detect bias…☆45Updated 2 months ago
- Module, Model, and Tensor Serialization/Deserialization☆234Updated last week
- A high-throughput and memory-efficient inference and serving engine for LLMs☆263Updated 7 months ago
- Formatron empowers everyone to control the format of language models' output with minimal overhead.☆200Updated last week
- Code for "LayerSkip: Enabling Early Exit Inference and Self-Speculative Decoding", ACL 2024☆301Updated last month
- AnyModal is a Flexible Multimodal Language Model Framework for PyTorch☆95Updated 5 months ago
- Scalable and Performant Data Loading☆269Updated last week
- 🚀 Efficiently (pre)training foundation models with native PyTorch features, including FSDP for training and SDPA implementation of Flash…☆249Updated this week
- Aana SDK is a powerful framework for building AI enabled multimodal applications.☆47Updated last week
- High level library for batched embeddings generation, blazingly-fast web-based RAG and quantized indexes processing ⚡☆66Updated 7 months ago
- Where GPUs get cooked 👩‍🍳🔥☆233Updated 3 months ago
- ☆46Updated last week
- Official PyTorch Implementation for the "Recovering the Pre-Fine-Tuning Weights of Generative Models" paper (ICML 2024).☆79Updated last month