EricLBuehler / xlora
X-LoRA: Mixture of LoRA Experts
☆219Updated 9 months ago
Alternatives and similar repositories for xlora:
Users who are interested in xlora are comparing it to the libraries listed below.
- Parameter-Efficient Sparsity Crafting From Dense to Mixture-of-Experts for Instruction Tuning on General Tasks☆143Updated 7 months ago
- Unofficial implementation for the paper "Mixture-of-Depths: Dynamically allocating compute in transformer-based language models"☆158Updated 10 months ago
- ☆256Updated last year
- Repo for Rho-1: Token-level Data Selection & Selective Pretraining of LLMs.☆409Updated last year
- ☆186Updated this week
- This is the official repository for Inheritune.☆111Updated 2 months ago
- Benchmarking LLMs with Challenging Tasks from Real Users☆221Updated 6 months ago
- ☆176Updated last year
- Implementation of 🥥 Coconut, Chain of Continuous Thought, in Pytorch☆165Updated 4 months ago
- Official repository of NEFTune: Noisy Embeddings Improves Instruction Finetuning☆395Updated 11 months ago
- Code and Data for "Long-context LLMs Struggle with Long In-context Learning" [TMLR2025]☆105Updated 2 months ago
- Official repository for "Scaling Retrieval-Based Language Models with a Trillion-Token Datastore".☆196Updated last week
- 🌾 OAT: A research-friendly framework for LLM online alignment, including preference learning, reinforcement learning, etc.☆336Updated 2 weeks ago
- ☆198Updated 5 months ago
- Code for "Critique Fine-Tuning: Learning to Critique is More Effective than Learning to Imitate"☆141Updated 2 weeks ago
- Layer-Condensed KV cache w/ 10 times larger batch size, fewer params and less computation. Dramatic speed up with better task performance…☆148Updated 3 weeks ago
- [ACL'24] Selective Reflection-Tuning: Student-Selected Data Recycling for LLM Instruction-Tuning☆354Updated 8 months ago
- Implementation of CALM from the paper "LLM Augmented LLMs: Expanding Capabilities through Composition", out of Google Deepmind☆176Updated 7 months ago
- Code for In-context Vectors: Making In Context Learning More Effective and Controllable Through Latent Space Steering☆171Updated 2 months ago
- Unofficial Implementation of Evolutionary Model Merging☆38Updated last year
- Block Transformer: Global-to-Local Language Modeling for Fast Inference (NeurIPS 2024)☆153Updated 3 weeks ago
- Advancing Language Model Reasoning through Reinforcement Learning and Inference Scaling☆101Updated 3 months ago
- ☆77Updated 3 months ago
- ☆176Updated 4 months ago
- Code for NeurIPS'24 paper 'Grokked Transformers are Implicit Reasoners: A Mechanistic Journey to the Edge of Generalization'☆189Updated 5 months ago
- Code for the EMNLP 2024 paper "Detecting and Mitigating Contextual Hallucinations in Large Language Models Using Only Attention Maps"☆120Updated 8 months ago
- Homepage for ProLong (Princeton long-context language models) and paper "How to Train Long-Context Language Models (Effectively)"☆177Updated 2 months ago
- ☆95Updated last month
- The official repo for "LLoCo: Learning Long Contexts Offline"☆116Updated 10 months ago
- Official implementation of "DoRA: Weight-Decomposed Low-Rank Adaptation"☆123Updated last year