[ICLR 2026] ParoQuant: Pairwise Rotation Quantization for Efficient Reasoning LLM Inference
☆300 · Jun 8, 2026 · Updated this week
Alternatives and similar repositories for paroquant
Users that are interested in paroquant are comparing it to the libraries listed below.
- The StereOS agent management daemon. ☆44 · May 15, 2026 · Updated 3 weeks ago
- QuIP quantization ☆66 · Mar 17, 2024 · Updated 2 years ago
- Pure MLX implementations of UMAP, t-SNE, PaCMAP, TriMap, DREAMS, CNE, MMAE, and NNDescent for Apple Silicon. Metal GPU for computation an… ☆86 · Mar 20, 2026 · Updated 2 months ago
- A Python library for automatic English to Katakana conversion ☆17 · Mar 25, 2026 · Updated 2 months ago
- Image Gaussian Splatting ☆25 · Jul 21, 2025 · Updated 10 months ago
- ☆175 · Jun 22, 2025 · Updated 11 months ago
- ☆79 · Jun 20, 2025 · Updated 11 months ago
- Create, manage, and orchestrate stereOS AI agent sandboxes. ☆73 · May 14, 2026 · Updated 3 weeks ago
- Repo for PyChart 1.39, refs http://download.gna.org/pychart/ ☆10 · Sep 29, 2014 · Updated 11 years ago
- ☆15 · Mar 21, 2025 · Updated last year
- ☆15 · Jun 1, 2026 · Updated last week
- The Official PyTorch implementation of Shared LoRA Subspaces for almost Strict Continual Learning ☆32 · May 7, 2026 · Updated last month
- Local AI runtime for training & running small LLMs directly on Apple Neural Engine (ANE). No CoreML. No Metal. Offline, on-device fine-tu… ☆97 · Mar 6, 2026 · Updated 3 months ago
- The official implementation of BiViT: Extremely Compressed Binary Vision Transformers ☆16 · Jun 18, 2023 · Updated 2 years ago
- Low level library for DiffSinger onnx model inference. ☆15 · Apr 11, 2025 · Updated last year
- ☆15 · Apr 26, 2025 · Updated last year
- The Wado Programming Language ☆90 · Updated this week
- TUI for browsing, canceling, and inspecting SLURM jobs ☆13 · Nov 13, 2023 · Updated 2 years ago
- Official PyTorch implementation of paper MAVIN: Multi-Action Video Generation with Diffusion Models via Transition Video Infilling ☆13 · Oct 5, 2024 · Updated last year
- The link to the stored-in-image imagenet64x64 dataset. And a resnet/wrn code for it. ☆15 · Aug 24, 2022 · Updated 3 years ago
- ☆24 · Jan 30, 2025 · Updated last year
- codes and plots for "Active-Dormant Attention Heads: Mechanistically Demystifying Extreme-Token Phenomena in LLMs" ☆11 · Dec 30, 2024 · Updated last year
- Artifacts for ATC '22 paper "Faster Software Packet Processing on FPGA NICs with eBPF Program Warping" ☆17 · May 20, 2022 · Updated 4 years ago
- My attempt to improve the speed of the newton schulz algorithm, starting from the dion implementation. ☆38 · Apr 30, 2026 · Updated last month
- Peakflo Unified Model Context Protocol (pfMCP) ☆19 · Updated this week
- Zippy Talking Avatar uses Azure Cognitive Services and OpenAI API to generate text and speech. It is built with Next.js and Tailwind CSS.… ☆16 · Feb 9, 2024 · Updated 2 years ago
- A suite of tools for pretty printing, diffing, and exploring abstract syntax trees. ☆16 · Mar 3, 2026 · Updated 3 months ago
- [ICML 2026] Official codebase for "Causal Forcing: Autoregressive Diffusion Distillation Done Right for High-Quality Real-Time Interactiv… ☆766 · Updated this week
- Course materials for 11-767 ☆13 · Nov 10, 2022 · Updated 3 years ago
- [ICLR 2026] QeRL enables RL for 32B LLMs on a single H100 GPU. ☆506 · Mar 30, 2026 · Updated 2 months ago
- ☆10 · Oct 24, 2024 · Updated last year
- [GSI 2023] Learning Lagrangian Fluid Mechanics with E(3)-Equivariant GNNs ☆15 · Jun 3, 2024 · Updated 2 years ago
- Bridge for audio transcription between Open-WebUI and Whisper, returns text in JSON format. ☆17 · Nov 5, 2024 · Updated last year
- Tiny evaluation of leading LLMs on competitive programming problems ☆14 · Apr 10, 2026 · Updated last month
- Termite ML inference service for embeddings, chunking, and reranking ☆27 · Apr 21, 2026 · Updated last month
- 🎨 Single-file distributable React posters — one .tsx file, every format you'll ever need. Works as a CLI and as a library. ☆67 · May 16, 2026 · Updated 3 weeks ago
- The official implementation of "EDA-DM: Enhanced Distribution Alignment for Post-Training Quantization of Diffusion Models" ☆21 · Jul 8, 2025 · Updated 11 months ago
- Sniffer, a course project for a sophomore-year network programming class ☆10 · Feb 28, 2022 · Updated 4 years ago
- Nightly Build for LMDeploy ☆11 · Jan 28, 2025 · Updated last year