[ICLR 2026] ParoQuant: Pairwise Rotation Quantization for Efficient Reasoning LLM Inference
☆313 · Jun 8, 2026 · Updated 3 weeks ago
Alternatives and similar repositories for paroquant
Users who are interested in paroquant are comparing it to the libraries listed below.
- QuIP quantization ☆66 · Mar 17, 2024 · Updated 2 years ago
- Pure MLX implementations of UMAP, t-SNE, PaCMAP, TriMap, DREAMS, CNE, MMAE, and NNDescent for Apple Silicon. Metal GPU for computation an… ☆87 · Mar 20, 2026 · Updated 3 months ago
- ☆34 · Mar 28, 2025 · Updated last year
- Sketch Based Image Retrieval ☆10 · Jul 13, 2018 · Updated 7 years ago
- CODA: Rewriting Transformer Blocks as GEMM-Epilogue Programs ☆216 · Updated this week
- ☆81 · Jun 20, 2025 · Updated last year
- ☆15 · Mar 21, 2025 · Updated last year
- Dice Language Support for VS Code ☆10 · Sep 29, 2020 · Updated 5 years ago
- Probabilistic Circuits in Julia ☆10 · Dec 27, 2023 · Updated 2 years ago
- The official implementation of BiViT: Extremely Compressed Binary Vision Transformers ☆16 · Jun 18, 2023 · Updated 3 years ago
- Low level library for DiffSinger onnx model inference. ☆15 · Apr 11, 2025 · Updated last year
- ☆15 · Apr 26, 2025 · Updated last year
- ☆18 · Mar 18, 2024 · Updated 2 years ago
- The link to the stored-in-image imagenet64x64 dataset. And a resnet/wrn code for it. ☆15 · Aug 24, 2022 · Updated 3 years ago
- wav2svp: Waveform & pitchs to Synthesizer V Project ☆17 · Jan 9, 2025 · Updated last year
- ☆24 · Jan 30, 2025 · Updated last year
- A repo based on XiLin Li's PSGD repo that extends some of the experiments. ☆14 · Oct 7, 2024 · Updated last year
- My attempt to improve the speed of the newton schulz algorithm, starting from the dion implementation. ☆38 · Apr 30, 2026 · Updated last month
- Arrow Matrix Decomposition - Communication-Efficient Distributed Sparse Matrix Multiplication ☆15 · Mar 25, 2024 · Updated 2 years ago
- The browser-native agent framework ☆204 · Jun 20, 2026 · Updated last week
- A suite of tools for pretty printing, diffing, and exploring abstract syntax trees. ☆18 · Mar 3, 2026 · Updated 3 months ago
- A NFC card reader for Campus card of NEU ( China ) ☆12 · Mar 13, 2021 · Updated 5 years ago
- [ICLR 2026] QeRL enables RL for 32B LLMs on a single H100 GPU. ☆507 · Mar 30, 2026 · Updated 2 months ago
- AFPQ code implementation ☆23 · Nov 6, 2023 · Updated 2 years ago
- ☆10 · Oct 24, 2024 · Updated last year
- Official implementation of "VSTAR: Generative Temporal Nursing for Longer Dynamic Video Synthesis" ☆21 · Jan 26, 2025 · Updated last year
- Tiny evaluation of leading LLMs on competitive programming problems ☆14 · Apr 10, 2026 · Updated 2 months ago
- ☆13 · Jun 29, 2024 · Updated last year
- Code repo for the paper "SpinQuant LLM quantization with learned rotations" ☆406 · Feb 14, 2025 · Updated last year
- The official implementation of "EDA-DM: Enhanced Distribution Alignment for Post-Training Quantization of Diffusion Models" ☆21 · Jul 8, 2025 · Updated 11 months ago
- 🎨 Single-file distributable React posters — one .tsx file, every format you'll ever need. Works as a CLI and as a library. ☆69 · May 16, 2026 · Updated last month
- ☆14 · Mar 7, 2022 · Updated 4 years ago
- A Python implementation of an agent swarm system that works with local LLM servers. The system allows you to create multiple agents that … ☆14 · Nov 20, 2024 · Updated last year
- [ICML 2025] Efficiently Serving Large Multimodal Models Using EPD Disaggregation ☆24 · May 29, 2025 · Updated last year
- Turning messy repos into weapons of mass structured context. ☆23 · Feb 20, 2026 · Updated 4 months ago
- ☆12 · Jun 17, 2024 · Updated 2 years ago
- A hackable library for running and fine-tuning modern transformer models on commodity and alternative GPUs, powered by tinygrad. ☆30 · Feb 10, 2026 · Updated 4 months ago
- ☆30 · Apr 29, 2026 · Updated 2 months ago
- ☆10 · Dec 10, 2024 · Updated last year