prateeky2806 / ties-merging
☆136 · Updated 7 months ago
Related projects:
- Code accompanying the paper "Massive Activations in Large Language Models" ☆104 · Updated 6 months ago
- Code release for Dataless Knowledge Fusion by Merging Weights of Language Models (https://openreview.net/forum?id=FCnohuR6AnM) ☆80 · Updated last year
- Code for In-context Vectors: Making In Context Learning More Effective and Controllable Through Latent Space Steering ☆130 · Updated 2 months ago
- PASTA: Post-hoc Attention Steering for LLMs ☆96 · Updated last week
- AI Logging for Interpretability and Explainability 🔬 ☆74 · Updated 3 months ago
- The Official Repository for "Bring Your Own Data! Self-Supervised Evaluation for Large Language Models" ☆109 · Updated 11 months ago
- [NeurIPS'23] Aging with GRACE: Lifelong Model Editing with Discrete Key-Value Adaptors ☆64 · Updated 6 months ago
- Code for PHATGOOSE introduced in "Learning to Route Among Specialized Experts for Zero-Shot Generalization" ☆76 · Updated 6 months ago
- Function Vectors in Large Language Models (ICLR 2024) ☆107 · Updated last month
- ☆69 · Updated 10 months ago
- Source code of "Task arithmetic in the tangent space: Improved editing of pre-trained models". ☆79 · Updated last year
- Official PyTorch implementation of DistiLLM: Towards Streamlined Distillation for Large Language Models (ICML 2024) ☆106 · Updated this week
- Code associated with Tuning Language Models by Proxy (Liu et al., 2024) ☆84 · Updated 5 months ago
- Code and Data for "Long-context LLMs Struggle with Long In-context Learning" ☆87 · Updated 2 months ago
- Self-Alignment with Principle-Following Reward Models ☆144 · Updated 6 months ago
- LLM-Merging: Building LLMs Efficiently through Merging ☆165 · Updated last week
- Implementation of CALM from the paper "LLM Augmented LLMs: Expanding Capabilities through Composition", out of Google Deepmind ☆161 · Updated last week
- Repo for Rho-1: Token-level Data Selection & Selective Pretraining of LLMs. ☆290 · Updated 5 months ago
- ☆117 · Updated 7 months ago
- Benchmarking LLMs with Challenging Tasks from Real Users ☆182 · Updated last month
- [ICLR 2024 Spotlight] Code for the paper "Merge, Then Compress: Demystify Efficient SMoE with Hints from Its Routing Policy" ☆63 · Updated 3 months ago
- contrastive decoding ☆174 · Updated last year
- Model Merging in LLMs, MLLMs, and Beyond: Methods, Theories, Applications and Opportunities. arXiv:2408.07666. ☆107 · Updated this week
- ☆239 · Updated 10 months ago
- "Improving Mathematical Reasoning with Process Supervision" by OPENAI ☆55 · Updated last week
- Collection of Tools and Papers related to Adapters / Parameter-Efficient Transfer Learning/ Fine-Tuning ☆168 · Updated 4 months ago
- AdaMerging: Adaptive Model Merging for Multi-Task Learning. ICLR, 2024. ☆40 · Updated 2 weeks ago
- open-source code for paper: Retrieval Head Mechanistically Explains Long-Context Factuality ☆135 · Updated last month
- Language models scale reliably with over-training and on downstream tasks ☆91 · Updated 5 months ago
- Parameter-Efficient Sparsity Crafting From Dense to Mixture-of-Experts for Instruction Tuning on General Tasks ☆123 · Updated 6 months ago