cjyaras / deep-lora-transformers
Compressible Dynamics in Deep Overparameterized Low-Rank Learning & Adaptation (ICML'24 Oral)
☆11 · Updated 4 months ago
Related projects
Alternatives and complementary repositories for deep-lora-transformers
- ☆47 · Updated last year
- ☆24 · Updated 5 months ago
- [NAACL 24 Oral] LoRETTA: Low-Rank Economic Tensor-Train Adaptation for Ultra-Low-Parameter Fine-Tuning of Large Language Models ☆26 · Updated 2 months ago
- [ICLR 2023] Eva: Practical Second-order Optimization with Kronecker-vectorized Approximation ☆12 · Updated last year
- SLTrain: a sparse plus low-rank approach for parameter and memory efficient pretraining (NeurIPS 2024) ☆24 · Updated 3 weeks ago
- ☆32 · Updated last year
- Bayesian low-rank adaptation for large language models ☆23 · Updated 6 months ago
- LoRA-XS: Low-Rank Adaptation with Extremely Small Number of Parameters ☆24 · Updated 3 weeks ago
- [ICML 2024] Junk DNA Hypothesis: A Task-Centric Angle of LLM Pre-trained Weights through Sparsity; Lu Yin*, Ajay Jaiswal*, Shiwei Liu, So… ☆15 · Updated 5 months ago
- [ICML 2024 Oral] This project is the official implementation of our Accurate LoRA-Finetuning Quantization of LLMs via Information Retenti… ☆59 · Updated 7 months ago
- Bayesian Low-Rank Adaptation for Large Language Models ☆28 · Updated 5 months ago
- ☆10 · Updated last month
- ☆16 · Updated 10 months ago
- Code for the paper "Why Transformers Need Adam: A Hessian Perspective" ☆43 · Updated 6 months ago
- [ICML 2024] Official code for the paper "Revisiting Zeroth-Order Optimization for Memory-Efficient LLM Fine-Tuning: A Benchmark". ☆75 · Updated 4 months ago
- [ICLR 2023] "Sparsity May Cry: Let Us Fail (Current) Sparse Neural Networks Together!" Shiwei Liu, Tianlong Chen, Zhenyu Zhang, Xuxi Chen… ☆27 · Updated last year
- Representation Surgery for Multi-Task Model Merging (ICML 2024) ☆28 · Updated last month
- Fast and Robust Early-Exiting Framework for Autoregressive Language Models with Synchronized Parallel Decoding (EMNLP 2023 Long) ☆53 · Updated last month
- [NeurIPS 2024 Spotlight] EMR-Merging: Tuning-Free High-Performance Model Merging ☆33 · Updated last month
- Repository for "Model Merging by Uncertainty-Based Gradient Matching" (ICLR 2024) ☆21 · Updated 6 months ago
- Revisiting Efficient Training Algorithms For Transformer-based Language Models (NeurIPS 2023) ☆79 · Updated last year
- Source code of the EMNLP 2022 Findings paper "SparseAdapter: An Easy Approach for Improving the Parameter-Efficiency of Adapters" ☆19 · Updated 7 months ago
- PyTorch package implementing PLATON: Pruning Large Transformer Models with Upper Confidence Bound of Weight Importance (ICML 2022) ☆41 · Updated 2 years ago
- Official repository of "Localizing Task Information for Improved Model Merging and Compression" [ICML 2024] ☆36 · Updated 3 weeks ago
- Localize-and-Stitch: Efficient Model Merging via Sparse Task Arithmetic ☆16 · Updated 2 months ago
- Code for the paper "'What Data Benefits My Classifier?' Enhancing Model Performance and Interpretability through Influence-Based Data Selecti…" ☆22 · Updated 6 months ago
- [ATTRIB @ NeurIPS 2024 Oral] When Attention Sink Emerges in Language Models: An Empirical View ☆29 · Updated last month
- [ICML 2022] Training Your Sparse Neural Network Better with Any Mask. Ajay Jaiswal, Haoyu Ma, Tianlong Chen, Ying Ding, and Zhangyang Wang ☆26 · Updated 2 years ago
- Preprint: Asymmetry in Low-Rank Adapters of Foundation Models ☆30 · Updated 8 months ago
- Code for the paper "Mehta, S. V., Patil, D., Chandar, S., & Strubell, E. (2023). An Empirical Investigation of the Role of Pre-training i…" ☆16 · Updated 8 months ago