iLearn-Lab/ACL25-PTQ1.61

☆15

Alternatives and similar repositories for ACL25-PTQ1.61

Users who are interested in ACL25-PTQ1.61 are comparing it to the libraries listed below.

YU-deep / MACT
☆18 · Jul 31, 2025 · Updated 8 months ago
ShiheWang / FIMA-Q
[CVPR 2025 Highlight] FIMA-Q: Post-Training Quantization for Vision Transformers by Fisher Information Matrix Approximation
☆26 · Jun 16, 2025 · Updated 9 months ago
HaoxuanXU1024 / IRPO
☆28 · Nov 28, 2025 · Updated 4 months ago
LeanModels / LeanQuant
Code repository for ICLR 2025 paper "LeanQuant: Accurate and Scalable Large Language Model Quantization with Loss-error-aware Grid"
☆27 · Mar 2, 2025 · Updated last year
disi-unibo-nlp / bio-ee-egv
[COLING22] Text-to-Text Extraction and Verbalization of Biomedical Event Graphs
☆10 · Nov 5, 2022 · Updated 3 years ago
Adaxry / Unified_Layer_Skipping
☆15 · Apr 11, 2024 · Updated 2 years ago
enyac-group / Quamba
The official repository of Quamba1 [ICLR 2025] & Quamba2 [ICML 2025]
☆67 · Jun 19, 2025 · Updated 9 months ago
christian42mmreason / ActivationReplay
☆21 · Dec 3, 2025 · Updated 4 months ago
Intelligent-Computing-Lab-Panda / TesseraQ
☆25 · Oct 31, 2024 · Updated last year
YU-deep / VisMem
☆79 · Feb 5, 2026 · Updated 2 months ago
Aaronhuang-778 / BiLLM
[ICML 2024] BiLLM: Pushing the Limit of Post-Training Quantization for LLMs
☆229 · Jan 11, 2025 · Updated last year
xxtvrxx233 / OplusOTAbr2dat2img
Automatically convert the br file from an Oplus OTA to an img-format image for flashing
☆11 · Oct 13, 2025 · Updated 5 months ago
HelloYym / EmbeddedSystemCourse
Embedded systems course labs and assignments
☆13 · Jun 30, 2016 · Updated 9 years ago
pixeli99 / MixLN
[ICLR 2025] Official Pytorch Implementation of "Mix-LN: Unleashing the Power of Deeper Layers by Combining Pre-LN and Post-LN" by Pengxia…
☆29 · Jul 24, 2025 · Updated 8 months ago
metacarbon / shareAtt
Beyond KV Caching: Shared Attention for Efficient LLMs
☆20 · Jul 19, 2024 · Updated last year
ASISys / AdaSkip
AdaSkip: Adaptive Sublayer Skipping for Accelerating Long-Context LLM Inference
☆21 · Jan 24, 2025 · Updated last year
htqin / IR-QLoRA
[ICML 2024 Oral] This project is the official implementation of our Accurate LoRA-Finetuning Quantization of LLMs via Information Retenti…
☆65 · Apr 15, 2024 · Updated last year
shoaibahmed / llm_depth_pruning
Official implementation of the paper: "A deeper look at depth pruning of LLMs"
☆15 · Jul 24, 2024 · Updated last year
midoks / dagger
A small sharp knife to eat meat
☆13 · Mar 9, 2023 · Updated 3 years ago
WangWenhao0716 / PDF-Embedding
[NeurIPS 2024] The official implementation of "Image Copy Detection for Diffusion Models"
☆18 · Oct 1, 2024 · Updated last year
goddoe / RLYX
A hackable, simple, and research-friendly GRPO Training Framework with high-speed weight synchronization in a multinode environment.
☆37 · Aug 27, 2025 · Updated 7 months ago
Egg-Hu / SMI
[ICML 2024] Sparse Model Inversion: Efficient Inversion of Vision Transformers with Less Hallucination
☆14 · Apr 29, 2025 · Updated 11 months ago
dongwonjo / FastKV
Official Implementation of FastKV: Decoupling of Context Reduction and KV Cache Compression for Prefill-Decoding Acceleration
☆30 · Updated this week
thu-nics / ViDiT-Q
[ICLR'25] ViDiT-Q: Efficient and Accurate Quantization of Diffusion Transformers for Image and Video Generation
☆157 · Mar 21, 2025 · Updated last year
DensoITLab / bitprune
☆11 · Apr 5, 2023 · Updated 3 years ago
thu-nics / MBQ
The code repository of "MBQ: Modality-Balanced Quantization for Large Vision-Language Models"
☆84 · Mar 17, 2025 · Updated last year
GATECH-EIC / Linearized-LLM
[ICML 2024] When Linear Attention Meets Autoregressive Decoding: Towards More Effective and Efficient Linearized Large Language Models
☆35 · Jun 12, 2024 · Updated last year
MaxBelitsky / cache-steering
KV Cache Steering for Inducing Reasoning in Small Language Models
☆48 · Jul 24, 2025 · Updated 8 months ago
gmlwns2000 / sea-attention
Official Implementation of SEA: Sparse Linear Attention with Estimated Attention Mask (ICLR 2024)
☆12 · Jun 20, 2025 · Updated 9 months ago
euiin / SMART
SMART introduces a novel test-time framework where Small Language Models (SLMs) reason step-by-step, and Large Language Models (LLMs) pro…
☆11 · Jul 9, 2025 · Updated 9 months ago
Peyton-Chen / Sparse-vDiT
The official implementation of "Sparse-vDiT: Unleashing the Power of Sparse Attention to Accelerate Video Diffusion Transformers" (arXiv …
☆51 · Jun 6, 2025 · Updated 10 months ago
shawnricecake / quart-depth
[CVPR 2025] QuartDepth
☆17 · Mar 24, 2025 · Updated last year
jiwonsong-dev / ReasoningPathCompression
[NeurIPS 2025] Official implementation of "Reasoning Path Compression: Compressing Generation Trajectories for Efficient LLM Reasoning"
☆32 · Oct 20, 2025 · Updated 5 months ago
iankur / vqllm
Residual vector quantization for KV cache compression in large language models
☆12 · Oct 22, 2024 · Updated last year
rees-c / PyREMBO
Python implementation of REMBO built on GPyTorch.
☆18 · Jul 11, 2020 · Updated 5 years ago
linkedin / ControlLLM
Control LLM
☆22 · Apr 6, 2025 · Updated last year
yueliu1999 / KLong
☆70 · Mar 2, 2026 · Updated last month
hellozhuo / msgc
Source code of our TNNLS paper "Boosting Convolutional Neural Networks with Middle Spectrum Grouped Convolution"
☆12 · Apr 14, 2023 · Updated 2 years ago
epfml / pam
☆16 · Dec 9, 2023 · Updated 2 years ago