SAI-Lab-NYU/QSVD

This repository provides the official implementation of QSVD, a method for efficient low-rank approximation that unifies Query-Key-Value (QKV) weight compression in low-precision Vision-Language Models (VLMs).

☆28

Alternatives and similar repositories for QSVD

Users who are interested in QSVD are comparing it to the repositories listed below.

wangqinsi1 / Dobi-SVD
[ICLR 2025] Dobi-SVD: Differentiable SVD for LLM Compression and Some New Perspectives
☆54 · Oct 19, 2025 · Updated 9 months ago
thu-nics / MBQ
The code repository of "MBQ: Modality-Balanced Quantization for Large Vision-Language Models"
☆93 · Mar 17, 2025 · Updated last year
Yeyke / HBLLM
[NeurIPS 2025 (spotlight)] HBLLM: Wavelet-Enhanced High-Fidelity 1-Bit Quantization for LLMs
☆16 · Dec 17, 2025 · Updated 7 months ago
StiphyJay / MQuant
[ACM MM 2025] MQuant: Unleashing the Inference Potential of Multimodal Large Language Models via Full Static Quantization
☆44 · Aug 13, 2025 · Updated 11 months ago
seanscott1991 / Duke_SQL4DQA
☆16 · Nov 5, 2025 · Updated 8 months ago
alibaba / EfficientAI
☆48 · May 9, 2026 · Updated 2 months ago
mit-han-lab / lpd
[ICLR 2026 Oral] Locality-aware Parallel Decoding for Efficient Autoregressive Image Generation
☆104 · May 8, 2026 · Updated 2 months ago
leesou / Step-into-RISCV
TA's implementation for the project of Computer Architecture and Intelligent Chip Design (23 Spring)
☆10 · May 20, 2023 · Updated 3 years ago
JRPan / crisp-artifact
☆14 · Feb 5, 2025 · Updated last year
ZHITENGLI / ARB-LLM
[ICLR'25] ARB-LLM: Alternating Refined Binarizations for Large Language Models
☆31 · Aug 5, 2025 · Updated 11 months ago
PKULab1806 / Fairy2i-W2
☆30 · Feb 6, 2026 · Updated 5 months ago
ZHITENGLI / AdaSVD
PyTorch code for our paper "AdaSVD: Adaptive Singular Value Decomposition for Large Language Models"
☆15 · Mar 9, 2025 · Updated last year
Intelligent-Computing-Lab-Panda / GPTAQ
Code implementation of GPTAQ (https://arxiv.org/abs/2504.02692)
☆92 · Jul 28, 2025 · Updated 11 months ago
AudioLabYork / SALTE-audio-renderer
Standalone software tool for conducting spatial audio listening tests.
☆15 · Nov 29, 2021 · Updated 4 years ago
inclusionAI / MoBE
Mixture-of-Basis-Experts for Compressing MoE-based LLMs
☆37 · Dec 24, 2025 · Updated 7 months ago
kmk97 / Fast-Loc-NeRF
[ICRA 2025] Fast Global Localization on Neural Radiance Field
☆17 · Jun 30, 2025 · Updated last year
mmlab-sigs / SizeGS
☆16 · Oct 28, 2025 · Updated 8 months ago
ruikangliu / FlatQuant
[ICML 2025] Official PyTorch implementation of "FlatQuant: Flatness Matters for LLM Quantization"
☆223 · Nov 25, 2025 · Updated 8 months ago
yihedeng9 / DuoGuard
DuoGuard: A Two-Player RL-Driven Framework for Multilingual LLM Guardrails
☆34 · Feb 26, 2025 · Updated last year
ant-research / M2-Miner
[ICLR 2026] M2-Miner: Multi-Agent Enhanced MCTS for Mobile GUI Agent Data Mining
☆55 · Apr 22, 2026 · Updated 3 months ago
HankYe / KVCOMM
[NeurIPS'25] KVCOMM: Online Cross-context KV-cache Communication for Efficient LLM-based Multi-agent Systems
☆17 · Nov 1, 2025 · Updated 8 months ago
CQU-EIE-Data-simulation-Lab / LLGS
LLGS: Illuminating Gaussian Splatting via absorptance Modulation
☆20 · Oct 16, 2024 · Updated last year
zlab-princeton / llm-pruning-collection
A collection of various LLM pruning implementations, training code for GPUs & TPUs, and evaluation scripts.
☆69 · Apr 20, 2026 · Updated 3 months ago
johndpope / ltx2-castlehill
CastleHill: Separable Causal Diffusion / Variation Flow Maps for LTX-2 long-form video generation
☆15 · May 19, 2026 · Updated 2 months ago
Xiaoqiang-Yan / TNNLS-DCIB
Cross-modal Clustering with Deep Correlated Information Bottleneck Method
☆11 · Aug 7, 2022 · Updated 3 years ago
hu-xianglong / harmonic-deformation
A cage-based deformation for meshes in 2D.
☆14 · Sep 8, 2018 · Updated 7 years ago
leloykun / steepest-descent-lean
Deriving steepest descent convergence bounds and hyperparameter scaling laws in machine learning optimization from first principles, form…
☆16 · Apr 11, 2026 · Updated 3 months ago
shadowpa0327 / Palu
[ICLR 2025] Palu: Compressing KV-Cache with Low-Rank Projection
☆158 · Feb 20, 2025 · Updated last year
4m4n5 / CLIP-Lite
PyTorch Implementation of CLIP-Lite | Accepted at AISTATS 2023
☆14 · Mar 17, 2023 · Updated 3 years ago
AlvinAi96 / wind_power_forecast
KDD Cup 2022 Baidu Wind Power Forecast project: Baidu wind power forecasting competition (Paddle Track, 5th place)
☆13 · Jul 29, 2022 · Updated 3 years ago
wenboluu / Ignition
One command · One Microsoft login · Zero repeated auth. Hours of uninterrupted access to NYU Torch from your terminal and IDE.
☆15 · Jul 17, 2026 · Updated last week
T2S-Bench / T2S-Bench
Official implementation of T2S-Bench & Structure-of-Thought: Benchmarking and Prompting Comprehensive Text-to-Structure Reasonin…
☆24 · Mar 5, 2026 · Updated 4 months ago
Ting-Justin-Jiang / ZEUS
[ACM MM 2026] ⚡ ZEUS accelerates your diffuser. Any modality. Any model. Any scheduler. https://yixiao-wang-stats.github.io/zeus/
☆20 · Jun 2, 2026 · Updated last month
lixiang3776 / SCHAIN
Code for SCHAIN, from the paper "Semi-supervised Clustering in Attributed Heterogeneous Information Networks"
☆10 · Jul 29, 2019 · Updated 6 years ago
FDU-ctk / HSI-detection
MATLAB code for hyperspectral target/anomaly detection
☆10 · Dec 18, 2020 · Updated 5 years ago
mlvlab / Representation-Shift
Official PyTorch implementation of "Representation Shift: Unifying Token Compression with FlashAttention" (ICCV 2025)
☆36 · Feb 22, 2026 · Updated 5 months ago
OptimAI-Lab / CudaForge
Official Repo of CudaForge
☆84 · Dec 2, 2025 · Updated 7 months ago
facebookresearch / any4
Quantize transformers to any learned arbitrary 4-bit numeric format
☆59 · Jul 2, 2026 · Updated 3 weeks ago
fatmanryilmaz / WRXD
Weighted Reed-Xiaoli Detector
☆13 · Mar 8, 2021 · Updated 5 years ago