SqueezeBits/GraLoRA


SqueezeBits / GraLoRA

☆35

Alternatives and similar repositories for GraLoRA

Users who are interested in GraLoRA are comparing it to the repositories listed below.


zwhe99 / RaSA
[ICLR 2025] RaSA: Rank-Sharing Low-Rank Adaptation
☆10 · May 19, 2025 · Updated last year
icip-cas / SSO
A scalable automated alignment method for large language models. Resources for "Aligning Large Language Models via Self-Steering Optimiza…
☆20 · Nov 21, 2024 · Updated last year
jiwonsong-dev / ReasoningPathCompression
[NeurIPS 2025] Official implementation of "Reasoning Path Compression: Compressing Generation Trajectories for Efficient LLM Reasoning"
☆32 · Oct 20, 2025 · Updated 9 months ago
dongwonjo / FastKV
[ACL Findings 2026] Official Implementation of "FastKV: Decoupling of Context Reduction and KV Cache Compression for Prefill-Decoding Acc…
☆32 · Apr 14, 2026 · Updated 3 months ago
hangeol / UniR
Official repo for paper: Universal Reasoner: A Single, Composable Plug-and-Play Reasoner for Frozen LLMs
☆20 · Nov 26, 2025 · Updated 7 months ago
epfml / pam
☆16 · Dec 9, 2023 · Updated 2 years ago
Bluear7878 / H2-Cache-A-Hierarchical-Dual-Stage-Cache
☆22 · Nov 3, 2025 · Updated 8 months ago
WalkerWorldPeace / DOGE
Official implementation of "Modeling Multi-Task Model Merging as Adaptive Projective Gradient Descent".
☆23 · May 23, 2025 · Updated last year
jessemelpolio / LMM_CL
Codes for: How to Teach Large Multimodal Models New Skills?
☆30 · Oct 10, 2025 · Updated 9 months ago
zaydzuhri / flame
Fork of Flame repo for training of some new stuff in development
☆20 · Jul 15, 2026 · Updated last week
OpenGVLab / LLMPrune-BESA
BESA is a differentiable weight pruning technique for large language models.
☆17 · Mar 4, 2024 · Updated 2 years ago
Adaxry / Unified_Layer_Skipping
☆15 · Apr 11, 2024 · Updated 2 years ago
GATECH-EIC / Linearized-LLM
[ICML 2024] When Linear Attention Meets Autoregressive Decoding: Towards More Effective and Efficient Linearized Large Language Models
☆35 · Jun 12, 2024 · Updated 2 years ago
hjeon2k / LRAgent
Official implementation of LRAgent: Efficient KV Cache Sharing for Multi-LoRA LLM Agents
☆26 · Feb 1, 2026 · Updated 5 months ago
Yui010206 / MEXA
[EMNLP 2025 Findings] MEXA: Towards General Multimodal Reasoning with Dynamic Multi-Expert Aggregation
☆15 · Aug 22, 2025 · Updated 11 months ago
MarkXCloud / CSpD
The official repo of continuous speculative decoding
☆36 · Mar 28, 2025 · Updated last year
BidyutSaha / TinyTNAS
TinyTNAS is a hardware-aware, multi-objective, time-bound Neural Architecture Search (NAS) tool designed for TinyML time series classific…
☆22 · Dec 11, 2024 · Updated last year
oujieww / ANPD
☆11 · Feb 5, 2026 · Updated 5 months ago
tasl-lab / RDD
Official implementation of "RDD: Retrieval-Based Demonstration Decomposer for Planner Alignment in Long-Horizon Tasks" NeurIPS 2025.
☆15 · Updated this week
iLearn-Lab / ACL25-PTQ1.61
☆15 · Apr 6, 2026 · Updated 3 months ago
lmbxmu / CutDiffusion
CutDiffusion: A Simple, Fast, Cheap, and Strong Diffusion Extrapolation Method
☆27 · Oct 9, 2025 · Updated 9 months ago
GangweiJiang / FvForgetting
☆15 · Apr 20, 2025 · Updated last year
soufiane001 / plop
Official code for PLoP
☆20 · Mar 6, 2026 · Updated 4 months ago
ChenMnZ / PrefixQuant
An algorithm for weight-activation quantization (W4A4, W4A8) of LLMs, supporting both static and dynamic quantization
☆176 · Nov 26, 2025 · Updated 8 months ago
Rishit-dagli / Squeeze3D
Squeeze3D: Your 3D Generation Model is Secretly an Extreme Neural Compressor
☆23 · Jun 12, 2025 · Updated last year
Qualcomm-AI-research / lr-qat
☆54 · Nov 5, 2024 · Updated last year
antgroup / importance-aware-sparse-tuning-IST-paper
☆22 · Dec 23, 2024 · Updated last year
ignoww / ZOODiP
[CVPR 2025] Efficient Personalization of Quantized Diffusion Model without Backpropagation
☆17 · Mar 31, 2025 · Updated last year
yale-nlp / refdpo
☆16 · Jul 23, 2024 · Updated 2 years ago
JiusiServe / LongVideoSparseAttention
Long Video Sparse Attention
☆18 · Jul 17, 2026 · Updated last week
BaichuanSEED / BaichuanSEED.github.io
Official Repository for Paper "BaichuanSEED: Sharing the Potential of ExtensivE Data Collection and Deduplication by Introducing a Compet…
☆18 · Aug 28, 2024 · Updated last year
Zcchill / Value-Residual-Learning
☆15 · Mar 20, 2025 · Updated last year
gccnlp / Light-PEFT
[ACL 2024 Findings] Light-PEFT: Lightening Parameter-Efficient Fine-Tuning via Early Pruning
☆13 · Sep 2, 2024 · Updated last year
JetBrains-Research / codegen-metrics
Replication package for evaluation of code generation metrics
☆17 · Nov 24, 2025 · Updated 8 months ago
sade-adrien / SteloCoder
☆16 · Dec 21, 2023 · Updated 2 years ago
pOpsPaper / pOps
Official implementation for "pOps: Photo-Inspired Diffusion Operators"
☆86 · Jul 23, 2024 · Updated 2 years ago
ASISys / AdaSkip
AdaSkip: Adaptive Sublayer Skipping for Accelerating Long-Context LLM Inference
☆21 · Jan 24, 2025 · Updated last year
saqibjaved1 / QT-DoG
[ICML 2025] QT-DOG: QUANTIZATION-AWARE TRAINING FOR DOMAIN GENERALIZATION
☆25 · Nov 30, 2025 · Updated 7 months ago
rd-vla / rd-vla
Official Repository for RD-VLA
☆40 · Mar 12, 2026 · Updated 4 months ago