NJUDeepEngine / meteora
This repository contains the implementation of the paper "MeteoRA: Multiple-tasks Embedded LoRA for Large Language Models".
☆16 · Updated 4 months ago
Alternatives and similar repositories for meteora:
Users interested in meteora are comparing it to the repositories listed below.
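MeteoRA's building block is LoRA, which adds a trainable low-rank update to a frozen weight matrix. A minimal NumPy sketch of the generic LoRA formulation follows (an illustration only, not MeteoRA's multi-task embedding or routing; the function and variable names are hypothetical):

```python
import numpy as np

def lora_forward(x, W, A, B, alpha=16.0):
    """Frozen weight W plus a low-rank LoRA update (alpha/r) * B @ A.

    x: (d_in,) input; W: (d_out, d_in) frozen weight;
    A: (r, d_in) and B: (d_out, r) trainable low-rank factors.
    """
    r = A.shape[0]
    delta = (alpha / r) * (B @ A)  # rank-r update to W
    return (W + delta) @ x

rng = np.random.default_rng(0)
d_in, d_out, r = 8, 4, 2
W = rng.standard_normal((d_out, d_in))
A = rng.standard_normal((r, d_in))
B = np.zeros((d_out, r))  # B starts at zero, so the adapter is initially a no-op

x = rng.standard_normal(d_in)
# With B = 0 the adapted layer matches the frozen layer exactly.
assert np.allclose(lora_forward(x, W, A, B), W @ x)
```

Because the update has rank r, each task adapter stores only (d_in + d_out) * r extra parameters per layer, which is what makes embedding many adapters into one base model feasible.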
- Preprint: Asymmetry in Low-Rank Adapters of Foundation Models ☆35 · Updated last year
- AdaMerging: Adaptive Model Merging for Multi-Task Learning (ICLR 2024) ☆72 · Updated 5 months ago
- Less is More: Task-aware Layer-wise Distillation for Language Model Compression (ICML 2023) ☆34 · Updated last year
- [ICML 2024] Official code for the paper "Revisiting Zeroth-Order Optimization for Memory-Efficient LLM Fine-Tuning: A Benchmark" ☆95 · Updated 9 months ago
- ThinK: Thinner Key Cache by Query-Driven Pruning ☆18 · Updated last month
- Representation Surgery for Multi-Task Model Merging (ICML 2024) ☆42 · Updated 5 months ago
- [ICLR 2025] When Attention Sink Emerges in Language Models: An Empirical View (Spotlight) ☆54 · Updated 5 months ago
- Quantized Side Tuning: Fast and Memory-Efficient Tuning of Quantized Large Language Models ☆42 · Updated 4 months ago
- [ICML 2024] Unveiling and Harnessing Hidden Attention Sinks: Enhancing Large Language Models without Training through Attention Calibrati… ☆33 · Updated 9 months ago
- [ICML 2024] Junk DNA Hypothesis: A Task-Centric Angle of LLM Pre-trained Weights through Sparsity; Lu Yin*, Ajay Jaiswal*, Shiwei Liu, So… ☆16 · Updated 10 months ago
- Official PyTorch implementation of "OwLore: Outlier-weighed Layerwise Sampled Low-Rank Projection for Memory-Efficient LLM Fine-tuning" b… ☆31 · Updated 10 months ago
- LISA: Layerwise Importance Sampling for Memory-Efficient Large Language Model Fine-Tuning ☆29 · Updated 11 months ago
- [ACL 2024] Not All Experts are Equal: Efficient Expert Pruning and Skipping for Mixture-of-Experts Large Language Models ☆86 · Updated 10 months ago
- Code accompanying the paper "Massive Activations in Large Language Models" ☆151 · Updated last year
- [NeurIPS 2024 Spotlight] EMR-Merging: Tuning-Free High-Performance Model Merging ☆53 · Updated last month
- Code for merging large language models ☆29 · Updated 7 months ago
- Squeezed Attention: Accelerating Long Prompt LLM Inference ☆45 · Updated 4 months ago
- Code for the paper "Merging Multi-Task Models via Weight-Ensembling Mixture of Experts" ☆22 · Updated 9 months ago
- Code release for VTW (AAAI 2025, Oral) ☆33 · Updated 2 months ago
- [ICML 2024] SPP: Sparsity-Preserved Parameter-Efficient Fine-Tuning for Large Language Models ☆18 · Updated 10 months ago
- [ICLR 2024 Spotlight] Code for the paper "Merge, Then Compress: Demystify Efficient SMoE with Hints from Its Routing Policy" ☆77 · Updated 9 months ago
- Fast and Robust Early-Exiting Framework for Autoregressive Language Models with Synchronized Parallel Decoding (EMNLP 2023, Long) ☆57 · Updated 6 months ago
- Accepted LLM papers at NeurIPS 2024 ☆34 · Updated 5 months ago