kyegomez / EvoVLM-JP
Plug-and-play PyTorch implementation of the paper "Evolutionary Optimization of Model Merging Recipes" by Sakana AI
☆31 · Updated last year
Alternatives and similar repositories for EvoVLM-JP
Users interested in EvoVLM-JP are comparing it to the libraries listed below.
- Unofficial implementation of Evolutionary Model Merging ☆41 · Updated last year
- ☆100 · Updated last year
- ☆75 · Updated last year
- Parameter-Efficient Sparsity Crafting From Dense to Mixture-of-Experts for Instruction Tuning on General Tasks (EMNLP'24) ☆147 · Updated last year
- The official implementation of Self-Exploring Language Models (SELM) ☆63 · Updated last year
- ☆69 · Updated last year
- A repository for research on medium-sized language models ☆78 · Updated last year
- ☆203 · Updated 11 months ago
- ☆109 · Updated last year
- ☆124 · Updated 9 months ago
- Challenge LLMs to Reason About Reasoning: A Benchmark to Unveil Cognitive Depth in LLMs ☆51 · Updated last year
- Co-LLM: Learning to Decode Collaboratively with Multiple Language Models ☆123 · Updated last year
- Implementation of CALM from the paper "LLM Augmented LLMs: Expanding Capabilities through Composition", out of Google DeepMind ☆178 · Updated last year
- ☆33 · Updated 10 months ago
- Implementation of 🥥 Coconut, Chain of Continuous Thought, in PyTorch ☆180 · Updated 5 months ago
- OpenCoconut implements a latent reasoning paradigm where thoughts are generated before decoding ☆173 · Updated 10 months ago
- ☆88 · Updated last year
- The official repository for Inheritune ☆115 · Updated 9 months ago
- Official repo for InSTA: Towards Internet-Scale Training For Agents ☆56 · Updated 4 months ago
- Code to reproduce "Transformers Can Do Arithmetic with the Right Embeddings", McLeish et al. (NeurIPS 2024) ☆194 · Updated last year
- Language models scale reliably with over-training and on downstream tasks ☆100 · Updated last year
- Official repository for the paper "SwitchHead: Accelerating Transformers with Mixture-of-Experts Attention" ☆101 · Updated last year
- ☆85 · Updated 2 weeks ago
- ☆55 · Updated last year
- LLM-Merging: Building LLMs Efficiently through Merging ☆205 · Updated last year
- Code for the arXiv preprint "The Unreasonable Effectiveness of Easy Training Data" ☆48 · Updated last year
- Q-Probe: A Lightweight Approach to Reward Maximization for Language Models ☆41 · Updated last year
- A curated reading list of research in Adaptive Computation, Inference-Time Computation & Mixture of Experts (MoE) ☆160 · Updated 10 months ago
- [NeurIPS 2024] GTBench: Uncovering the Strategic Reasoning Limitations of LLMs via Game-Theoretic Evaluations ☆67 · Updated last year
- Repository for NPHardEval, a quantified-dynamic benchmark of LLMs ☆61 · Updated last year