NVIDIA-NeMo / Gym
Build RL environments for LLM training
☆568 · Updated this week
Alternatives and similar repositories for Gym
Users who are interested in Gym are comparing it to the libraries listed below.
- bloom - evaluate any behavior immediately 🌸🌱 ☆1,027 · Updated this week
- Code for paper "The Markovian Thinker: Architecture-Agnostic Linear Scaling of Reasoning" ☆329 · Updated last month
- Developer Asset Hub for NVIDIA Nemotron: A one-stop resource for training recipes, usage cookbooks, and full end-to-end reference exampl… ☆314 · Updated this week
- Post-training with Tinker ☆2,699 · Updated this week
- OpenTinker is an RL-as-a-Service infrastructure for foundation models ☆499 · Updated last week
- PyTorch-native post-training at scale ☆584 · Updated this week
- Research code artifacts for Code World Model (CWM) including inference tools, reproducibility, and documentation. ☆792 · Updated 2 weeks ago
- A Lightweight LLM Post-Training Library ☆2,092 · Updated this week
- Tiny Model, Big Logic: Diversity-Driven Optimization Elicits Large-Model Reasoning Ability in VibeThinker-1.5B ☆559 · Updated last month
- WeDLM: The fastest diffusion language model with standard causal attention and native KV cache compatibility, delivering real speedups ov… ☆480 · Updated last week
- codes for R-Zero: Self-Evolving Reasoning LLM from Zero Data (https://www.arxiv.org/pdf/2508.05004) ☆720 · Updated 3 weeks ago
- ☆126 · Updated 3 months ago
- A benchmark for LLMs on complicated tasks in the terminal ☆1,305 · Updated 2 weeks ago
- Scalable toolkit for efficient model reinforcement ☆1,210 · Updated this week
- ☆301 · Updated 5 months ago
- ☆1,257 · Updated last month
- GRPO training code which scales to 32xH100s for long horizon terminal/coding tasks. Base agent is now the top Qwen3 agent on Stanford's T… ☆323 · Updated 4 months ago
- open-source coding LLM for software engineering tasks ☆1,087 · Updated 3 months ago
- ☆854 · Updated 3 months ago
- OpenCUA: Open Foundations for Computer-Use Agents ☆627 · Updated last week
- Async RL Training at Scale ☆985 · Updated this week
- ToolOrchestra is an end-to-end RL training framework for orchestrating tools and agentic workflows. ☆450 · Updated 2 weeks ago
- Matrix (Multi-Agent daTa geneRation Infra and eXperimentation framework) is a versatile engine for multi-agent conversational data genera… ☆250 · Updated this week
- Loong: Synthesize Long CoTs at Scale through Verifiers. ☆478 · Updated last week
- Official Repository for "Glyph: Scaling Context Windows via Visual-Text Compression" ☆544 · Updated 2 months ago
- Next paradigm for LLM Agent. Unify plan and action through recursive code generation for adaptive, human-like decision-making. ☆523 · Updated last month
- An interface library for RL post training with environments. ☆973 · Updated this week
- Seed-Coder is a family of lightweight open-source code LLMs comprising base, instruct and reasoning models, developed by ByteDance Seed. ☆722 · Updated 7 months ago
- [NeurIPS 2025 D&B Spotlight] Scaling Data for SWE-agents ☆509 · Updated this week
- An Open-Source Large-Scale Reinforcement Learning Project for Search Agents ☆530 · Updated last month