prateeky2806 / ties-merging
☆153 · Updated 9 months ago
Related projects
Alternatives and complementary repositories for ties-merging
- Code accompanying the paper "Massive Activations in Large Language Models" ☆123 · Updated 8 months ago
- Code release for Dataless Knowledge Fusion by Merging Weights of Language Models (https://openreview.net/forum?id=FCnohuR6AnM) ☆85 · Updated last year
- Code for In-context Vectors: Making In Context Learning More Effective and Controllable Through Latent Space Steering ☆144 · Updated last month
- AdaMerging: Adaptive Model Merging for Multi-Task Learning. ICLR, 2024. ☆52 · Updated 3 weeks ago
- ☆81 · Updated last year
- Source code of "Task arithmetic in the tangent space: Improved editing of pre-trained models". ☆87 · Updated last year
- AI Logging for Interpretability and Explainability🔬 ☆89 · Updated 5 months ago
- Code associated with Tuning Language Models by Proxy (Liu et al., 2024) ☆97 · Updated 7 months ago
- Function Vectors in Large Language Models (ICLR 2024) ☆119 · Updated last month
- Building modular LMs with parameter-efficient fine-tuning. ☆86 · Updated this week
- Benchmarking LLMs with Challenging Tasks from Real Users ☆198 · Updated 3 weeks ago
- [ICLR 2024 Spotlight] Code for the paper "Merge, Then Compress: Demystify Efficient SMoE with Hints from Its Routing Policy" ☆64 · Updated 5 months ago
- Code for PHATGOOSE introduced in "Learning to Route Among Specialized Experts for Zero-Shot Generalization" ☆78 · Updated 8 months ago
- PASTA: Post-hoc Attention Steering for LLMs ☆108 · Updated 2 months ago
- The official implementation of the paper "What Matters in Transformers? Not All Attention is Needed". ☆143 · Updated last week
- [NeurIPS 2024 Spotlight] EMR-Merging: Tuning-Free High-Performance Model Merging ☆32 · Updated last month
- Language models scale reliably with over-training and on downstream tasks ☆94 · Updated 7 months ago
- Self-Alignment with Principle-Following Reward Models ☆147 · Updated 8 months ago
- Code and Data for "Long-context LLMs Struggle with Long In-context Learning" ☆91 · Updated 4 months ago
- ☆63 · Updated 2 years ago
- Official PyTorch implementation of DistiLLM: Towards Streamlined Distillation for Large Language Models (ICML 2024) ☆139 · Updated 2 months ago
- A Survey on Data Selection for Language Models ☆183 · Updated last month
- The Paper List on Data Contamination for Large Language Models Evaluation. ☆76 · Updated this week
- ☆122 · Updated 10 months ago
- Repo for Rho-1: Token-level Data Selection & Selective Pretraining of LLMs. ☆307 · Updated 7 months ago
- Public code repo for paper "SaySelf: Teaching LLMs to Express Confidence with Self-Reflective Rationales" ☆96 · Updated last month
- ☆128 · Updated 6 months ago
- Model Merging in LLMs, MLLMs, and Beyond: Methods, Theories, Applications and Opportunities. arXiv:2408.07666. ☆220 · Updated this week
- Parameter-Efficient Sparsity Crafting From Dense to Mixture-of-Experts for Instruction Tuning on General Tasks ☆129 · Updated 2 months ago
- PyTorch implementation for "Compressed Context Memory For Online Language Model Interaction" (ICLR'24) ☆50 · Updated 7 months ago