EricLBuehler / xlora
X-LoRA: Mixture of LoRA Experts
☆229 Updated 10 months ago
Alternatives and similar repositories for xlora
Users that are interested in xlora are comparing it to the libraries listed below
- ☆264 Updated last year
- Parameter-Efficient Sparsity Crafting From Dense to Mixture-of-Experts for Instruction Tuning on General Tasks ☆144 Updated 9 months ago
- Repo for Rho-1: Token-level Data Selection & Selective Pretraining of LLMs. ☆421 Updated last year
- Implementation of 🥥 Coconut, Chain of Continuous Thought, in Pytorch ☆175 Updated this week
- Unofficial implementation for the paper "Mixture-of-Depths: Dynamically allocating compute in transformer-based language models" ☆163 Updated last year
- Code for "Critique Fine-Tuning: Learning to Critique is More Effective than Learning to Imitate" ☆159 Updated 3 weeks ago
- Official repository for ORPO ☆455 Updated last year
- [ICLR2025] DiffuGPT and DiffuLLaMA: Scaling Diffusion Language Models via Adaptation from Autoregressive Models ☆213 Updated 3 weeks ago
- L1: Controlling How Long A Reasoning Model Thinks With Reinforcement Learning ☆222 Updated last month
- Official repository of NEFTune: Noisy Embeddings Improves Instruction Finetuning ☆396 Updated last year
- [ACL'24] Selective Reflection-Tuning: Student-Selected Data Recycling for LLM Instruction-Tuning ☆357 Updated 9 months ago
- General Reasoner: Advancing LLM Reasoning Across All Domains ☆141 Updated 2 weeks ago
- This is the official repository for Inheritune. ☆111 Updated 4 months ago
- Benchmarking LLMs with Challenging Tasks from Real Users ☆226 Updated 7 months ago
- Official Implementation for the paper "d1: Scaling Reasoning in Diffusion Large Language Models via Reinforcement Learning" ☆213 Updated last week
- minimal GRPO implementation from scratch ☆90 Updated 3 months ago
- LongRoPE is a novel method that can extend the context window of pre-trained LLMs to an impressive 2048k tokens. ☆231 Updated 10 months ago
- ☆300 Updated 3 weeks ago
- Implementation of the LongRoPE: Extending LLM Context Window Beyond 2 Million Tokens Paper ☆137 Updated 11 months ago
- Block Transformer: Global-to-Local Language Modeling for Fast Inference (NeurIPS 2024) ☆158 Updated 2 months ago
- [ICML'24] Data and code for our paper "Training-Free Long-Context Scaling of Large Language Models" ☆411 Updated 8 months ago
- Implementation of paper Data Engineering for Scaling Language Models to 128K Context ☆463 Updated last year
- Code and Data for "Long-context LLMs Struggle with Long In-context Learning" [TMLR2025] ☆105 Updated 4 months ago
- ☆109 Updated 3 months ago
- Official PyTorch implementation of DistiLLM: Towards Streamlined Distillation for Large Language Models (ICML 2024) ☆222 Updated 3 months ago
- ☆183 Updated last year
- Official implementation for the paper "DoLa: Decoding by Contrasting Layers Improves Factuality in Large Language Models" ☆498 Updated 5 months ago
- 🌾 OAT: A research-friendly framework for LLM online alignment, including reinforcement learning, preference learning, etc. ☆383 Updated 2 weeks ago
- Official repository for "Scaling Retrieval-Based Language Models with a Trillion-Token Datastore". ☆205 Updated 2 weeks ago
- Official repository for ACL 2025 paper "ProcessBench: Identifying Process Errors in Mathematical Reasoning" ☆158 Updated last month