camel-ai/seta-env

💻 SETA: Scaling Environments for Terminal Agents - Environments

☆143

Alternatives and similar repositories for seta-env

Users that are interested in seta-env are comparing it to the libraries listed below.

camel-ai / seta
💻 SETA: Scaling Environments for Terminal Agents
☆128 · Updated this week
kanishkg / endless-terminals
☆135 · Mar 31, 2026 · Updated 3 months ago
Danau5tin / terminal-bench-rl
GRPO training code which scales to 32xH100s for long horizon terminal/coding tasks. Base agent is now the top Qwen3 agent on Stanford's T…
☆398 · Aug 24, 2025 · Updated 11 months ago
harbor-framework / awesome-harbor
A curated list of awesome Harbor ecosystem projects
☆48 · May 29, 2026 · Updated 2 months ago
open-thoughts / OpenThoughts-TBLite
A Difficulty-Calibrated Benchmark for Building Terminal Agents
☆29 · Feb 20, 2026 · Updated 5 months ago
FrontierCS / FrontierSmith
FrontierSmith, a new system that uses AI to synthesize open-ended coding problems at scale
☆50 · May 30, 2026 · Updated last month
Danau5tin / tbench-agentic-data-pipeline
Multi-agent synthetic data generation pipeline capable of generating and validating long horizon terminal/coding tasks for RL training
☆71 · Jul 28, 2025 · Updated last year
alibaba / terminal-bench-pro
☆119 · Apr 1, 2026 · Updated 3 months ago
SWE-Gym / SWE-Gym
Code for Paper: Training Software Engineering Agents and Verifiers with SWE-Gym [ICML 2025]
☆712 · Jul 29, 2025 · Updated last year
R2E-Gym / R2E-Gym
[COLM 2025] Official repository for R2E-Gym: Procedural Environment Generation and Hybrid Verifiers for Scaling Open-Weights SWE Agents
☆312 · Jul 13, 2025 · Updated last year
abundant-ai / SWE-gen
Convert GitHub PRs into Harbor tasks
☆72 · Jul 13, 2026 · Updated 2 weeks ago
RUCAIBox / SWE-World
☆49 · Mar 6, 2026 · Updated 4 months ago
harbor-framework / harbor-datasets
☆36 · May 16, 2026 · Updated 2 months ago
SWE-bench / SWE-smith
[NeurIPS 2025 D&B Spotlight] Scaling Data for SWE-agents
☆717 · Updated this week
harbor-framework / terminal-bench-science
Terminal-Bench Science: Evaluating AI Agents on Complex Real-World Scientific Workflows in the Terminal
☆219 · Updated this week
hkust-nlp / Toolathlon
[ICLR 2026] The Tool Decathlon: Benchmarking Language Agents for Diverse, Realistic, and Long-Horizon Task Execution
☆443 · Updated this week
harbor-framework / frontier-bench
Measuring and evolving with the frontier of agent work
☆416 · Updated this week
harbor-framework / harbor
Framework for evaluating and improving agents
☆3,647 · Updated this week
PrimeIntellect-ai / prime-rl
Agentic RL Training at Scale
☆1,759 · Updated this week
harbor-framework / terminal-bench
A benchmark for LLMs on complicated tasks in the terminal
☆2,499 · Jul 11, 2026 · Updated 2 weeks ago
abundant-ai / swe-marathon
SWE-Marathon: an ultra long-horizon SWE benchmark
☆117 · Updated this week
harbor-framework / terminal-bench-2
☆349 · Apr 30, 2026 · Updated 2 months ago
ucsb-mlsec / DevOps-Gym
☆28 · Mar 11, 2026 · Updated 4 months ago
NVIDIA-NeMo / ProRL-Agent-Server
Agentic RL on Any Harness at Scale
☆716 · Jul 15, 2026 · Updated 2 weeks ago
aisa-group / PostTrainBench
Measuring how well CLI agents like Claude Code or Codex CLI can post-train base LLMs on a single H100 GPU in 10 hours
☆475 · Jul 22, 2026 · Updated last week
DeepSoftwareAnalytics / swe-factory
[FSE'2026] SWE-Factory: Your Automated Factory for Issue Resolution Training Data and Evaluation Benchmarks
☆183 · May 12, 2026 · Updated 2 months ago
LMCache / lmcache-agent-trace
Agent application/benchmark/workload traces should be placed here.
☆15 · Apr 13, 2026 · Updated 3 months ago
harbor-framework / terminal-bench-challenges
☆19 · Jun 18, 2026 · Updated last month
swt-user / DMPO
☆54 · Oct 10, 2024 · Updated last year
FuRuF-11 / AIT
A repository to introduce the algorithmic information theory. You could learn what is Kolmogorov complexity and why it is important here.
☆13 · Jul 23, 2025 · Updated last year
KillerShoaib / RLM-From-Scratch
Implementation of Recursive Language Model paper from scratch
☆46 · Feb 10, 2026 · Updated 5 months ago
xirui-li / ClawEnvKit
Open-source Environment toolkit of claw-like agents, support task/harness generation and evaluation
☆58 · May 7, 2026 · Updated 2 months ago
open-tinker / OpenTinker
OpenTinker is an RL-as-a-Service infrastructure for foundation models
☆677 · Mar 21, 2026 · Updated 4 months ago
THUDM / slime
slime is an LLM post-training framework for RL Scaling.
☆7,696 · Updated this week
hamishivi / tmax
Training terminal-agents
☆254 · Jul 22, 2026 · Updated last week
NovaSky-AI / SkyRL
SkyRL: A Modular Full-stack RL Library for LLMs
☆2,102 · Updated this week
facebookresearch / swe-rl
[NeurIPS'25] Official codebase for "SWE-RL: Advancing LLM Reasoning via Reinforcement Learning on Open Software Evolution"
☆712 · Mar 16, 2025 · Updated last year
eigent-ai / toolathlon_gym
Toolathlon-Gym for testing AI agents real-world tool-use capabilities across diverse MCP servers.
☆141 · Jul 22, 2026 · Updated last week
radixark / miles
Miles is an enterprise-facing reinforcement learning framework for LLM and VLM post-training, forked from and co-evolving with slime.
☆1,809 · Updated this week