VsonicV / es-fine-tuning-paper
This repo contains the source code for the paper "Evolution Strategies at Scale: LLM Fine-Tuning Beyond Reinforcement Learning"
☆295 · Updated this week
Alternatives and similar repositories for es-fine-tuning-paper
Users that are interested in es-fine-tuning-paper are comparing it to the libraries listed below
- A Recipe for Building LLM Reasoners to Solve Complex Instructions ☆29 · Oct 9, 2025 · Updated 4 months ago
- CLaMR: Contextualized Late-Interaction for Multimodal Content Retrieval ☆23 · Jun 28, 2025 · Updated 7 months ago
- ☆27 · Jul 18, 2025 · Updated 6 months ago
- Martingale posterior neural networks for fast sequential decision making @ NeurIPS 2025 ☆22 · Nov 13, 2025 · Updated 3 months ago
- MB-X.01 · Logical Origin Node (L.O.N.) — TruthΩ → Co⁺ → Score⁺. Verifiable demo and specs. https://massimiliano.neocities.org/ ☆54 · Feb 3, 2026 · Updated 2 weeks ago
- ☆140 · Sep 29, 2025 · Updated 4 months ago
- Code for paper "Analog Foundation Models" ☆30 · Sep 18, 2025 · Updated 4 months ago
- [ICLR 2026] Official PyTorch Implementation of RLP: Reinforcement as a Pretraining Objective ☆232 · Jan 26, 2026 · Updated 3 weeks ago
- Ludic – an LLM-RL library for the era of experience ☆58 · Jan 9, 2026 · Updated last month
- Resa: Transparent Reasoning Models via SAEs ☆47 · Sep 23, 2025 · Updated 4 months ago
- Enemies for your LLM ☆35 · Jan 20, 2026 · Updated 3 weeks ago
- ☆37 · Aug 4, 2025 · Updated 6 months ago
- AgentSynth: Scalable Task Generation for Generalist Computer-Use Agents ☆37 · Oct 7, 2025 · Updated 4 months ago
- [ICLR2026] Test-Time Scaling with Reflective Generative Model ☆301 · Jan 28, 2026 · Updated 2 weeks ago
- 🔥 Learn Svelte starter ☆29 · Nov 3, 2025 · Updated 3 months ago
- ☆28 · Feb 4, 2026 · Updated last week
- Async RL Training at Scale ☆1,071 · Updated this week
- ☆18 · Jun 20, 2025 · Updated 7 months ago
- Scaling RL on advanced reasoning models ☆665 · Oct 20, 2025 · Updated 3 months ago
- Storing long contexts in tiny caches with self-study ☆237 · Dec 5, 2025 · Updated 2 months ago
- ROSA+: RWKV's ROSA implementation with fallback statistical predictor ☆32 · Oct 13, 2025 · Updated 4 months ago
- Train a SmolLM-style llm on fineweb-edu in JAX/Flax with an assortment of optimizers. ☆18 · Jul 24, 2025 · Updated 6 months ago
- LLM training in simple, raw C/CUDA ☆15 · Dec 5, 2024 · Updated last year
- A ComfyUI plugin that provides a user interface of StableStudio ☆22 · Aug 15, 2025 · Updated 6 months ago
- Training teachers with reinforcement learning able to make LLMs learn how to reason for test time scaling. ☆357 · Jun 23, 2025 · Updated 7 months ago
- Multi-Turn RL Training System with AgentTrainer for Language Model Game Reinforcement Learning ☆59 · Dec 18, 2025 · Updated last month
- Training framework with a goal to explore the frontier of sample efficiency of small language models ☆98 · Jan 25, 2026 · Updated 3 weeks ago
- ComfyUI for Audio ☆40 · Sep 21, 2025 · Updated 4 months ago
- Language modeling with linear-cost context ☆116 · Sep 25, 2025 · Updated 4 months ago
- Project code for training LLMs to write better unit tests + code ☆21 · May 19, 2025 · Updated 8 months ago
- ☆439 · Jan 29, 2026 · Updated 2 weeks ago
- ☆20 · Mar 25, 2025 · Updated 10 months ago
- A version of the game Battleships, play against a monte-carlo simulation based AI ☆19 · Jan 10, 2019 · Updated 7 years ago
- Code to estimate DunedinPACNI scores from FreeSurfer parcellations of brain MRI data. ☆41 · Sep 20, 2025 · Updated 4 months ago
- The Automated LLM Speedrunning Benchmark measures how well LLM agents can reproduce previous innovations and discover new ones in languag… ☆128 · Oct 9, 2025 · Updated 4 months ago
- ☆84 · Nov 22, 2025 · Updated 2 months ago
- ☆24 · Apr 3, 2025 · Updated 10 months ago
- frozen-in-time version of our Paper Finder agent for reproducing evaluation results ☆225 · Aug 20, 2025 · Updated 5 months ago
- RAG Agent for the ARC AGI Challenge ☆20 · Jul 1, 2024 · Updated last year