[ICLR 2026] ParoQuant: Pairwise Rotation Quantization for Efficient Reasoning LLM Inference
☆227Apr 23, 2026Updated this week
Alternatives and similar repositories for paroquant
Users who are interested in paroquant are comparing it to the libraries listed below.
- TinyNS: Platform-Aware Neurosymbolic Auto Tiny Machine Learning☆25Jun 2, 2023Updated 2 years ago
- GrFormer: A Novel Transformer on Grassmann Manifold for Infrared and Visible Image Fusion☆18Dec 14, 2025Updated 4 months ago
- Sketch Based Image Retrieval☆10Jul 13, 2018Updated 7 years ago
- ☆74Jun 20, 2025Updated 10 months ago
- ☆15Mar 21, 2025Updated last year
- Minute-long video generation at 24FPS.☆64Mar 28, 2026Updated last month
- The official implementation of BiViT: Extremely Compressed Binary Vision Transformers☆16Jun 18, 2023Updated 2 years ago
- [ICCV 2025] QuantCache: Adaptive Importance-Guided Quantization with Hierarchical Latent and Layer Caching for Video Generation☆16Sep 26, 2025Updated 7 months ago
- ☆15Apr 26, 2025Updated last year
- ☆18Mar 18, 2024Updated 2 years ago
- PyTorch implementation of our paper accepted by ICML 2023 -- "Bi-directional Masks for Efficient N:M Sparse Training"☆13Jun 7, 2023Updated 2 years ago
- Code and plots for "Active-Dormant Attention Heads: Mechanistically Demystifying Extreme-Token Phenomena in LLMs"☆10Dec 30, 2024Updated last year
- Arrow Matrix Decomposition - Communication-Efficient Distributed Sparse Matrix Multiplication☆15Mar 25, 2024Updated 2 years ago
- A suite of tools for pretty printing, diffing, and exploring abstract syntax trees.☆15Mar 3, 2026Updated last month
- [ICLR 2026] QeRL enables RL for 32B LLMs on a single H100 GPU.☆501Mar 30, 2026Updated 3 weeks ago
- A simple tool for managing sets of environment variables☆17Dec 25, 2025Updated 4 months ago
- AFPQ code implementation☆23Nov 6, 2023Updated 2 years ago
- ☆10Oct 24, 2024Updated last year
- Builds and runs code in a sandboxed macOS environment☆36Dec 5, 2025Updated 4 months ago
- Tiny evaluation of leading LLMs on competitive programming problems☆14Apr 10, 2026Updated 2 weeks ago
- ☆43Apr 13, 2026Updated 2 weeks ago
- Code repo for the paper "SpinQuant: LLM quantization with learned rotations"☆390Feb 14, 2025Updated last year
- The official implementation of "EDA-DM: Enhanced Distribution Alignment for Post-Training Quantization of Diffusion Models"☆21Jul 8, 2025Updated 9 months ago
- Official Repo for Fast-SAM3D: 3Dfy Anything in Images but Faster☆127Mar 25, 2026Updated last month
- A synthesis flow for hybrid processing-in-RRAM modes☆12Jul 15, 2021Updated 4 years ago
- It's like Redis but a bit rusty...☆12Mar 3, 2026Updated last month
- Sniffer, a course project for a sophomore-year network programming class☆10Feb 28, 2022Updated 4 years ago
- A Python implementation of an agent swarm system that works with local LLM servers. The system allows you to create multiple agents that …☆13Nov 20, 2024Updated last year
- ☆14Mar 7, 2022Updated 4 years ago
- ☆30Jan 22, 2026Updated 3 months ago
- FPGA 2025 SAT Accel: A modern SAT Solver on FPGA Repository☆14Mar 13, 2025Updated last year
- Turning messy repos into weapons of mass structured context.☆22Feb 20, 2026Updated 2 months ago
- Official Chinese documentation for RWKV☆15Apr 16, 2026Updated last week
- A hackable library for running and fine-tuning modern transformer models on commodity and alternative GPUs, powered by tinygrad.☆29Feb 10, 2026Updated 2 months ago
- ☆12Jun 17, 2024Updated last year
- ☆19Apr 22, 2026Updated last week
- LMTuner: Make the LLM Better for Everyone☆38Sep 21, 2023Updated 2 years ago
- An interactive TUI for visualizing code statistics from tokei.☆35Jan 20, 2026Updated 3 months ago
- The official implementation of "Sparse-vDiT: Unleashing the Power of Sparse Attention to Accelerate Video Diffusion Transformers" (arXiv …☆51Jun 6, 2025Updated 10 months ago