juyongjiang / KaSA
[ICLR'25] Code for KaSA, an official implementation of "KaSA: Knowledge-Aware Singular-Value Adaptation of Large Language Models"
☆18 Updated 5 months ago
Alternatives and similar repositories for KaSA
Users who are interested in KaSA are comparing it to the repositories listed below.
- This is a simple torch implementation of the high performance Multi-Query Attention ☆17 Updated last year
- [ICLR 2025] Official Pytorch Implementation of "Mix-LN: Unleashing the Power of Deeper Layers by Combining Pre-LN and Post-LN" by Pengxia… ☆25 Updated 6 months ago
- One Initialization to Rule them All: Fine-tuning via Explained Variance Adaptation ☆40 Updated 8 months ago
- ☆18 Updated 5 months ago
- Mask-Enhanced Autoregressive Prediction: Pay Less Attention to Learn More ☆31 Updated last month
- [NeurIPS 2024 Main Track] Code for the paper titled "Instruction Tuning With Loss Over Instructions" ☆38 Updated last year
- ☆23 Updated 3 months ago
- Official repository of "Distort, Distract, Decode: Instruction-Tuned Model Can Refine its Response from Noisy Instructions", ICLR 2024 Sp… ☆20 Updated last year
- ☆26 Updated last year
- ☆73 Updated 2 months ago
- Official implementation of ECCV24 paper: POA ☆24 Updated 11 months ago
- ☆20 Updated last year
- [NeurIPS 2024] Low rank memory efficient optimizer without SVD ☆30 Updated last week
- ☆68 Updated last year
- Implementation of the paper: "Mixture-of-Depths: Dynamically allocating compute in transformer-based language models" ☆97 Updated this week
- Code for "Everybody Prune Now: Structured Pruning of LLMs with only Forward Passes" ☆28 Updated last year
- A repository for research on medium sized language models. ☆77 Updated last year
- Recycling diverse models ☆45 Updated 2 years ago
- Official PyTorch Implementation for Vision-Language Models Create Cross-Modal Task Representations, ICML 2025 ☆27 Updated 2 months ago
- ☆33 Updated last year
- Lottery Ticket Adaptation ☆39 Updated 7 months ago
- MatFormer repo ☆43 Updated 7 months ago
- ☆19 Updated 4 months ago
- Pytorch implementation for "Compressed Context Memory For Online Language Model Interaction" (ICLR'24) ☆61 Updated last year
- ☆19 Updated 3 months ago
- Codebase for Instruction Following without Instruction Tuning ☆35 Updated 9 months ago
- Code for NOLA, an implementation of "nola: Compressing LoRA using Linear Combination of Random Basis" ☆56 Updated 10 months ago
- ☆36 Updated 9 months ago
- We study toy models of skill learning. ☆29 Updated 5 months ago
- [NAACL 2025] A Closer Look into Mixture-of-Experts in Large Language Models ☆52 Updated 5 months ago