VijayLingam95 / SVFT
☆21 Updated 5 months ago
Related projects
Alternatives and complementary repositories for SVFT
- ☆27 Updated last year
- [ICLR 2023] "Sparse MoE as the New Dropout: Scaling Dense and Self-Slimmable Transformers" by Tianlong Chen*, Zhenyu Zhang*, Ajay Jaiswal… ☆44 Updated last year
- Fast and Robust Early-Exiting Framework for Autoregressive Language Models with Synchronized Parallel Decoding (EMNLP 2023 Long) ☆52 Updated last month
- ☆20 Updated last week
- ☆61 Updated 2 years ago
- Code for "Everybody Prune Now: Structured Pruning of LLMs with only Forward Passes" ☆28 Updated 7 months ago
- ☆34 Updated 8 months ago
- Official repository of "Localizing Task Information for Improved Model Merging and Compression" [ICML 2024] ☆34 Updated 2 weeks ago
- ☆34 Updated 3 months ago
- ☆61 Updated 2 months ago
- ☆29 Updated last year
- Yet another random morning idea to be quickly tried and architecture shared if it works; to allow the transformer to pause for any amount… ☆49 Updated last year
- AdaMerging: Adaptive Model Merging for Multi-Task Learning. ICLR, 2024. ☆49 Updated 2 weeks ago
- ☆50 Updated 5 months ago
- [NeurIPS'23] Aging with GRACE: Lifelong Model Editing with Discrete Key-Value Adaptors ☆69 Updated 8 months ago
- Revisiting Efficient Training Algorithms For Transformer-based Language Models (NeurIPS 2023) ☆79 Updated last year
- ☆31 Updated last year
- [ATTRIB @ NeurIPS 2024] When Attention Sink Emerges in Language Models: An Empirical View ☆27 Updated 3 weeks ago
- Source code of "Task arithmetic in the tangent space: Improved editing of pre-trained models". ☆84 Updated last year
- HGRN2: Gated Linear RNNs with State Expansion ☆49 Updated 2 months ago
- Code accompanying the paper "Massive Activations in Large Language Models" ☆121 Updated 8 months ago
- This is the repository for "Model Merging by Uncertainty-Based Gradient Matching", ICLR 2024. ☆20 Updated 5 months ago
- ☆149 Updated 9 months ago
- [NeurIPS 2024 Main Track] Code for the paper titled "Instruction Tuning With Loss Over Instructions" ☆25 Updated 5 months ago
- Exploring Model Kinship for Merging Large Language Models ☆18 Updated last week
- [ICLR 2024] This is the repository for the paper titled "DePT: Decomposed Prompt Tuning for Parameter-Efficient Fine-tuning" ☆95 Updated 7 months ago
- ☆50 Updated last week
- ☆44 Updated last year
- The official implementation of the paper "Demystifying the Compression of Mixture-of-Experts Through a Unified Framework". ☆48 Updated 2 weeks ago
- ☆14 Updated 11 months ago