llm-merging / LLM-Merging
LLM-Merging: Building LLMs Efficiently through Merging
☆191Updated 5 months ago
Alternatives and similar repositories for LLM-Merging:
Users who are interested in LLM-Merging are comparing it to the repositories listed below
- ☆95Updated 8 months ago
- ☆167Updated last year
- ☆89Updated last year
- The official implementation of the paper "What Matters in Transformers? Not All Attention is Needed".☆162Updated 3 months ago
- Function Vectors in Large Language Models (ICLR 2024)☆140Updated 4 months ago
- ☆152Updated 3 weeks ago
- Steering vectors for transformer language models in Pytorch / Huggingface☆88Updated last week
- Code accompanying the paper "Massive Activations in Large Language Models"☆141Updated 11 months ago
- Code release for "Debating with More Persuasive LLMs Leads to More Truthful Answers"☆101Updated 11 months ago
- Implementation of 🥥 Coconut, Chain of Continuous Thought, in Pytorch☆158Updated 2 months ago
- ☆171Updated last year
- Reproducible, flexible LLM evaluations☆166Updated 2 months ago
- Implementation of CALM from the paper "LLM Augmented LLMs: Expanding Capabilities through Composition", out of Google DeepMind☆173Updated 5 months ago
- Direct Preference Optimization from scratch in PyTorch☆84Updated last year
- Code for the paper "VinePPO: Unlocking RL Potential For LLM Reasoning Through Refined Credit Assignment"☆126Updated 3 months ago
- Repo for Rho-1: Token-level Data Selection & Selective Pretraining of LLMs.☆400Updated 10 months ago
- open-source code for paper: Retrieval Head Mechanistically Explains Long-Context Factuality☆175Updated 7 months ago
- ☆153Updated this week
- AI Logging for Interpretability and Explainability🔬☆105Updated 8 months ago
- ☆98Updated last month
- [ICML 2024] One Prompt is Not Enough: Automated Construction of a Mixture-of-Expert Prompts - TurningPoint AI☆18Updated 5 months ago
- Code for In-context Vectors: Making In Context Learning More Effective and Controllable Through Latent Space Steering☆164Updated 2 weeks ago
- ☆118Updated 5 months ago
- Official repository for "Scaling Retrieval-Based Language Models with a Trillion-Token Datastore".☆189Updated this week
- Language models scale reliably with over-training and on downstream tasks☆96Updated 11 months ago
- Code for Zero-Shot Tokenizer Transfer☆121Updated last month
- 🌾 OAT: A research-friendly framework for LLM online alignment, including preference learning, reinforcement learning, etc.☆208Updated last week
- [NeurIPS'24 Spotlight] Observational Scaling Laws☆51Updated 5 months ago
- Open source replication of Anthropic's Crosscoders for Model Diffing☆40Updated 4 months ago