WentseChen/Verlog

Verlog: A Multi-turn RL framework for LLM agents

☆73

Alternatives and similar repositories for Verlog

Users who are interested in Verlog are comparing it to the repositories listed below.

roger-creus / Wave-Defense-Learning-Environment
A video game made with PyGame, turned into an OpenAI Gym learning environment for reinforcement learning agents.
☆14 · Jan 3, 2023 · Updated 3 years ago
Li-ChangHao / CoNav
☆12 · Jul 16, 2024 · Updated 2 years ago
yaof20 / verl
verl: Volcano Engine Reinforcement Learning for LLMs
☆22 · Nov 6, 2025 · Updated 8 months ago
THUDM / AgentRL
Scaling Agentic Reinforcement Learning with a Multi-Turn, Multi-Task Framework
☆324 · Jan 17, 2026 · Updated 6 months ago
johnlime / RlkitExtension
Collection of reinforcement learning algorithms
☆16 · Sep 29, 2025 · Updated 9 months ago
sjtu-sai-agents / Browse-Master
Official implementation of Browse-Master, a tool-augmented web-search agent.
☆36 · Aug 22, 2025 · Updated 11 months ago
abdulhaim / LMRL-Gym
☆116 · Jul 2, 2024 · Updated 2 years ago
WangHanLinHenry / SPA-RL-Agent
Official code for the paper "SPA-RL: Reinforcing LLM Agent via Stepwise Progress Attribution"
☆89 · Sep 13, 2025 · Updated 10 months ago
inclusionAI / ASearcher
An Open-Source Large-Scale Reinforcement Learning Project for Search Agents
☆602 · Nov 26, 2025 · Updated 8 months ago
spiral-rl / spiral
SPIRAL: Self-Play on Zero-Sum Games Incentivizes Reasoning via Multi-Agent Multi-Turn Reinforcement Learning
☆199 · Mar 27, 2026 · Updated 3 months ago
balrog-ai / BALROG
Benchmarking Agentic LLM and VLM Reasoning On Games
☆261 · Apr 9, 2026 · Updated 3 months ago
jinhangzhan / RL_Heals_SFT
☆21 · Mar 22, 2026 · Updated 4 months ago
benellis3 / pymarl2
Fine-tuned MARL algorithms on SMAC (100% win rates on most scenarios)
☆19 · Aug 20, 2023 · Updated 2 years ago
TextArena / UnstableBaselines
☆120 · Apr 7, 2026 · Updated 3 months ago
axon-rl / gem
A Gym for Agentic LLMs
☆502 · Jan 21, 2026 · Updated 6 months ago
long-horizon-execution / measuring-execution
☆57 · Mar 18, 2026 · Updated 4 months ago
Agent-One-Lab / AgentFly
Scalable and extensible reinforcement learning for LM agents.
☆122 · May 6, 2026 · Updated 2 months ago
mll-lab-nu / RAGEN
RAGEN leverages reinforcement learning to train LLM reasoning agents in interactive, stochastic environments.
☆2,756 · Updated this week
microsoft / SmartPlay
SmartPlay is a benchmark for Large Language Models (LLMs). It uses a variety of games to test important capabilities of LLMs as agents. …
☆146 · Apr 11, 2024 · Updated 2 years ago
nex-agi / NexRL
NexRL is an ultra-loosely-coupled LLM post-training framework.
☆114 · Updated this week
LuLuLuyi / R-HORIZON
[ICLR'2026] R-HORIZON: How Far Can Your Large Reasoning Model Really Go in Breadth and Depth?
☆18 · Oct 21, 2025 · Updated 9 months ago
NovaSky-AI / SkyRL
SkyRL: A Modular Full-stack RL Library for LLMs
☆2,093 · Updated this week
langfengQ / verl-agent
verl-agent is an extension of veRL, designed for training LLM/VLM agents via RL. verl-agent is also the official code for the paper "Group-in…
☆2,153 · Jun 9, 2026 · Updated last month
facebookresearch / sweet_rl
Benchmark and research code for the paper "SWEET-RL: Training Multi-Turn LLM Agents on Collaborative Reasoning Tasks"
☆271 · May 5, 2025 · Updated last year
inclusionAI / AWorld-RL
Agentic Learning Powered by AWorld
☆117 · Jun 18, 2026 · Updated last month
joey00072 / Attention-as-graph
An alternative way of calculating self-attention
☆18 · May 25, 2024 · Updated 2 years ago
sail-sg / feedback-conditional-policy
Code for "Language Models Can Learn from Verbal Feedback Without Scalar Rewards"
☆65 · Jan 5, 2026 · Updated 6 months ago
WooooDyy / AgentGym-RL
Code and implementations for the paper "AgentGym-RL: Training LLM Agents for Long-Horizon Decision Making through Multi-Turn Reinforcemen…
☆820 · Feb 15, 2026 · Updated 5 months ago
kxfan2002 / Reagent
Agent-RRM: Exploring Reasoning Reward Model for Agents
☆70 · Mar 17, 2026 · Updated 4 months ago
Bluedotdot2021 / PRML-book_review
Companion material for PRML Page-by-page: a review of the whole PRML book and of each chapter
☆17 · Apr 16, 2024 · Updated 2 years ago
Terra-Flux / PolyRL
[NSDI'26] PolyRL is a reinforcement learning framework for LLMs that harvests spot instances on the cloud to reduce cost.
☆19 · Mar 30, 2026 · Updated 3 months ago
pearls-lab / meow-tea-taro
A Practitioner's Guide to M(eow)ti Turn Agentic ReinfOrcement learning
☆83 · Jan 16, 2026 · Updated 6 months ago
VAGOsolutions / SauerkrautLM-Doom-MultiVec
A tiny 1.3M parameter model that plays DOOM, outperforming LLMs up to 92,000x its size.
☆26 · May 11, 2026 · Updated 2 months ago
nex-agi / NexHTML
HTML Agent based on NexAU
☆16 · Nov 20, 2025 · Updated 8 months ago
maitrix-org / dynamic-alignment-optimization
[EMNLP'24 (Main)] DRPO (Dynamic Rewarding with Prompt Optimization) is a tuning-free approach for self-alignment. DRPO leverages a search-…
☆24 · Nov 17, 2024 · Updated last year
Kwai-Klear / mini-swe-agent-plus
mini-swe-agent-plus: a tiny (~100 LOC) GitHub issue fixer, now with a robust multi-line text edit tool.
☆25 · Jan 20, 2026 · Updated 6 months ago
thelongestusernameofall / 360-LLaMA-Factory
Adds sequence parallelism to LLaMA-Factory
☆12 · Dec 31, 2024 · Updated last year
MiroMindAI / MiroRL
MiroRL is an MCP-first reinforcement learning framework for deep research agents.
☆246 · Aug 27, 2025 · Updated 10 months ago
zorazrw / agent-skill-induction
Agent Skill Induction: "Inducing Programmatic Skills for Agentic Tasks"
☆42 · Apr 24, 2025 · Updated last year