kyegomez / EvoVLM-JP

Plug-and-play PyTorch implementation of the paper "Evolutionary Optimization of Model Merging Recipes" by Sakana AI

☆30

Alternatives and similar repositories for EvoVLM-JP

Users who are interested in EvoVLM-JP are comparing it to the repositories listed below.

fangyuan-ksgk / Evolutionary-Model-Merge
Unofficial Implementation of Evolutionary Model Merging
☆39 Updated last year
katiekang1998 / reasoning_generalization
☆33 Updated 6 months ago
minyoungg / LTE
☆68 Updated last year
RobertCsordas / moe_attention
Official repository for the paper "SwitchHead: Accelerating Transformers with Mixture-of-Experts Attention"
☆98 Updated 9 months ago
dvlab-research / MR-GSM8K
Challenge LLMs to Reason About Reasoning: A Benchmark to Unveil Cognitive Depth in LLMs
☆49 Updated last year
FasterDecoding / BitDelta
☆199 Updated 7 months ago
kyleliang919 / Online-Subspace-Descent
[NeurIPS 2024] Low rank memory efficient optimizer without SVD
☆30 Updated 2 weeks ago
jeffreysijuntan / lloco
The official repo for "LLoCo: Learning Long Contexts Offline"
☆117 Updated last year
JacobPfau / fillerTokens
☆66 Updated last year
lucidrains / coconut-pytorch
Implementation of 🥥 Coconut, Chain of Continuous Thought, in Pytorch
☆177 Updated 3 weeks ago
RobertCsordas / moeut
☆82 Updated 10 months ago
ScalingIntelligence / large_language_monkeys
☆96 Updated 9 months ago
mlfoundations / scaling
Language models scale reliably with over-training and on downstream tasks
☆97 Updated last year
architsharma97 / dpo-rlaif
☆98 Updated last year
SalesforceAIResearch / LaTRO
☆117 Updated 4 months ago
lucidrains / PEER-pytorch
Pytorch implementation of the PEER block from the paper, Mixture of A Million Experts, by Xu Owen He at Deepmind
☆127 Updated 10 months ago
shenao-zhang / SELM
The official implementation of Self-Exploring Language Models (SELM)
☆64 Updated last year
wuhy68 / Parameter-Efficient-MoE
Parameter-Efficient Sparsity Crafting From Dense to Mixture-of-Experts for Instruction Tuning on General Tasks
☆144 Updated 9 months ago
casper-hansen / OpenCoconut
OpenCoconut implements a latent reasoning paradigm where we generate thoughts before decoding.
☆173 Updated 6 months ago
sanyalsunny111 / LLM-Inheritune
This is the official repository for Inheritune.
☆111 Updated 5 months ago
Parallel-Reasoning / APR
[COLM 2025] Code for Paper: Learning Adaptive Parallel Reasoning with Language Models
☆114 Updated 2 months ago
OpenEvaByte / evabyte
EvaByte: Efficient Byte-level Language Models at Scale
☆103 Updated 2 months ago
SalesforceAIResearch / GemFilter
☆80 Updated 6 months ago
ContextualAI / CLAIR_and_APO
Anchored Preference Optimization and Contrastive Revisions: Addressing Underspecification in Alignment
☆60 Updated 10 months ago
Linear95 / SPAG
Self-playing Adversarial Language Game Enhances LLM Reasoning, NeurIPS 2024
☆137 Updated 4 months ago
vicksEmmanuel / latent-gemma
☆26 Updated 6 months ago
google-deepmind / asyncdiloco
☆45 Updated last year
google-deepmind / bbeh
☆83 Updated 2 months ago
clinicalml / co-llm
Co-LLM: Learning to Decode Collaboratively with Multiple Language Models
☆116 Updated last year
VITA-Group / WeLore
From GaLore to WeLore: How Low-Rank Weights Non-uniformly Emerge from Low-Rank Gradients. Ajay Jaiswal, Lu Yin, Zhenyu Zhang, Shiwei Liu,…
☆47 Updated 2 months ago