☆72 · Sep 19, 2025 · Updated 6 months ago
Alternatives and similar repositories for Qwen3-Quantization
Users who are interested in Qwen3-Quantization are comparing it to the repositories listed below.
- (ICLR 2025) BinaryDM: Accurate Weight Binarization for Efficient Diffusion Models ☆26 · Oct 4, 2024 · Updated last year
- Artifact evaluation for HPCA'24 paper Lightening-Transformer: A Dynamically-operated Optically-interconnected Photonic Transformer Accele… ☆11 · Mar 3, 2024 · Updated 2 years ago
- The official implementation of the AAAI 2024 paper Bi-ViT. ☆12 · Dec 18, 2023 · Updated 2 years ago
- Official code for the paper "HEXA-MoE: Efficient and Heterogeneous-Aware MoE Acceleration with Zero Computation Redundancy" ☆15 · Mar 6, 2025 · Updated last year
- [EMNLP 2024] RoLoRA: Fine-tuning Rotated Outlier-free LLMs for Effective Weight-Activation Quantization ☆38 · Sep 24, 2024 · Updated last year
- [ICML 2025] SliM-LLM: Salience-Driven Mixed-Precision Quantization for Large Language Models ☆53 · Aug 9, 2024 · Updated last year
- [ECCV 2022] SuperTickets: Drawing Task-Agnostic Lottery Tickets from Supernets via Jointly Architecture Searching and Parameter Pruning ☆20 · Jul 7, 2022 · Updated 3 years ago
- [NAACL'25 🏆 SAC Award] Official code for "Advancing MoE Efficiency: A Collaboration-Constrained Routing (C2R) Strategy for Better Expert… ☆16 · Feb 4, 2025 · Updated last year
- Quantum Hamiltonian Descent: numerical simulation, real-machine deployment, and benchmarking ☆12 · Jan 16, 2024 · Updated 2 years ago
- ☆20 · Updated this week
- (NeurIPS 2024) BiDM: Pushing the Limit of Quantization for Diffusion Models ☆22 · Nov 20, 2024 · Updated last year
- The first spoken long-text dataset derived from live streams, designed to reflect the redundancy-rich and conversational nature of real-w… ☆12 · Jun 28, 2025 · Updated 8 months ago
- ☆72 · Jun 20, 2025 · Updated 9 months ago
- The code for "VISTA: Enhancing Long-Duration and High-Resolution Video Understanding by VIdeo SpatioTemporal Augmentation" [CVPR2025] ☆21 · Feb 27, 2025 · Updated last year
- ☆16 · Mar 26, 2025 · Updated 11 months ago
- [CVPR 2024] PTQ4SAM: Post-Training Quantization for Segment Anything ☆83 · Jun 26, 2024 · Updated last year
- (ICCV 2023) Official implementation of Rectified Straight Through Estimator (ReSTE). ☆31 · Sep 20, 2024 · Updated last year
- An algorithm for weight-activation quantization (W4A4, W4A8) of LLMs, supporting both static and dynamic quantization ☆172 · Nov 26, 2025 · Updated 3 months ago
- [CVPR 2025 Highlight] FIMA-Q: Post-Training Quantization for Vision Transformers by Fisher Information Matrix Approximation ☆25 · Jun 16, 2025 · Updated 9 months ago
- Official implementation of "Beyond Theorem Proving: Formulation, Framework and Benchmark for Formal Problem-Solving" ☆29 · May 8, 2025 · Updated 10 months ago
- Code implementation of GPTAQ (https://arxiv.org/abs/2504.02692) ☆88 · Jul 28, 2025 · Updated 7 months ago
- [ICLR25] STBLLM: Breaking the 1-Bit Barrier with Structured Binary LLMs ☆19 · Jun 3, 2025 · Updated 9 months ago
- Coq & Haskell code for Calculating Correct Compilers II ☆12 · Feb 22, 2022 · Updated 4 years ago
- ☆25 · Nov 10, 2021 · Updated 4 years ago
- [ACL 2025] RealHiTBench: A Comprehensive Realistic Hierarchical Table Benchmark for Evaluating LLM-Based Table Analysis ☆24 · Aug 8, 2025 · Updated 7 months ago
- [ICLR2025] Codebase for "ReMoE: Fully Differentiable Mixture-of-Experts with ReLU Routing", built on Megatron-LM. ☆109 · Dec 20, 2024 · Updated last year
- [COLM 2025] Official PyTorch implementation of "Quantization Hurts Reasoning? An Empirical Study on Quantized Reasoning Models" ☆72 · Jul 8, 2025 · Updated 8 months ago
- LLM inference in C/C++ ☆26 · Updated this week
- Accelerating GOT-OCRv2 with VLLM ☆11 · Nov 15, 2024 · Updated last year
- Active Example Selection for In-Context Learning (EMNLP'22) ☆49 · Jul 22, 2024 · Updated last year
- mini is mini ☆20 · Jan 19, 2020 · Updated 6 years ago
- Inverse Scaling in Test-Time Compute ☆25 · Dec 3, 2025 · Updated 3 months ago
- [TECS'23] A project on the co-design of Accelerators and CNNs. ☆21 · Dec 10, 2022 · Updated 3 years ago
- As defined in Lubotzky, Philips and Sarnak ☆10 · Oct 25, 2022 · Updated 3 years ago
- Pixel Parsing. A reproduction of OCR-free end-to-end document understanding models with open data ☆23 · Jul 30, 2024 · Updated last year
- Official PyTorch Implementation for Vision-Language Models Create Cross-Modal Task Representations, ICML 2025 ☆33 · May 1, 2025 · Updated 10 months ago
- ☆21 · Feb 5, 2024 · Updated 2 years ago
- [ICML'25] Official code for paper "Occult: Optimizing Collaborative Communication across Experts for Accelerated Parallel MoE Training an… ☆13 · Apr 17, 2025 · Updated 11 months ago
- General Image Classification Code base ☆22 · Jul 12, 2021 · Updated 4 years ago