SakanaAI / evolutionary-model-merge
Official repository of Evolutionary Optimization of Model Merging Recipes
☆1,315 Updated 4 months ago
Alternatives and similar repositories for evolutionary-model-merge:
Users who are interested in evolutionary-model-merge compare it to the libraries listed below.
- Implementation of the training framework proposed in Self-Rewarding Language Model, from MetaAI ☆1,378 Updated last year
- The official implementation of Self-Play Fine-Tuning (SPIN) ☆1,148 Updated 11 months ago
- Codebase for Merging Language Models (ICML 2024) ☆816 Updated 11 months ago
- [ICLR 2025] Samba: Simple Hybrid State Space Models for Efficient Unlimited Context Language Modeling ☆862 Updated 2 months ago
- Implementation of "BitNet: Scaling 1-bit Transformers for Large Language Models" in pytorch ☆1,795 Updated 3 weeks ago
- GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection ☆1,542 Updated 5 months ago
- Code for Quiet-STaR ☆730 Updated 8 months ago
- Tools for merging pretrained large language models. ☆5,571 Updated this week
- Reaching LLaMA2 Performance with 0.1M Dollars ☆981 Updated 9 months ago
- Lighteval is your all-in-one toolkit for evaluating LLMs across multiple backends ☆1,438 Updated last week
- A library with extensible implementations of DPO, KTO, PPO, ORPO, and other human-aware loss functions (HALOs). ☆833 Updated last week
- Schedule-Free Optimization in PyTorch ☆2,142 Updated last week
- Recipes to scale inference-time compute of open models ☆1,058 Updated 2 months ago
- Official repository for ORPO ☆448 Updated 10 months ago
- Repository for Meta Chameleon, a mixed-modal early-fusion foundation model from FAIR. ☆1,984 Updated 8 months ago
- A family of open-sourced Mixture-of-Experts (MoE) Large Language Models ☆1,508 Updated last year
- A Self-adaptation Framework🐙 that adapts LLMs for unseen tasks in real-time! ☆1,040 Updated 2 months ago
- Training LLMs with QLoRA + FSDP ☆1,472 Updated 5 months ago
- A bibliography and survey of the papers surrounding o1 ☆1,187 Updated 5 months ago
- ☆1,015 Updated 4 months ago
- Minimalistic large language model 3D-parallelism training ☆1,793 Updated this week
- [NeurIPS 2024] SimPO: Simple Preference Optimization with a Reference-Free Reward ☆873 Updated 2 months ago
- 0️⃣1️⃣🤗 BitNet-Transformers: Huggingface Transformers Implementation of "BitNet: Scaling 1-bit Transformers for Large Language Models" i… ☆296 Updated last year
- Stanford NLP Python library for Representation Finetuning (ReFT) ☆1,463 Updated 2 months ago
- A library for advanced large language model reasoning ☆2,099 Updated 2 weeks ago
- Training Large Language Model to Reason in a Continuous Latent Space ☆1,062 Updated 3 months ago
- YaRN: Efficient Context Window Extension of Large Language Models ☆1,470 Updated last year
- A PyTorch native library for large-scale model training ☆3,627 Updated this week
- ☆444 Updated last year
- Mamba-Chat: A chat LLM based on the state-space model architecture 🐍 ☆922 Updated last year