Danau5tin/tbench-agentic-data-pipeline

Multi-agent synthetic data generation pipeline capable of generating and validating long horizon terminal/coding tasks for RL training

☆71

Alternatives and similar repositories for tbench-agentic-data-pipeline

Users interested in tbench-agentic-data-pipeline are comparing it to the repositories listed below.

Danau5tin / terminal-bench-rl
GRPO training code which scales to 32xH100s for long horizon terminal/coding tasks. Base agent is now the top Qwen3 agent on Stanford's T…
☆398 · Aug 24, 2025 · Updated 11 months ago
abundant-ai / SWE-gen
Convert GitHub PRs into Harbor tasks
☆72 · Jul 13, 2026 · Updated 2 weeks ago
alibaba / terminal-bench-pro
☆119 · Apr 1, 2026 · Updated 3 months ago
kanishkg / endless-terminals
☆135 · Mar 31, 2026 · Updated 3 months ago
camel-ai / seta
💻 SETA: Scaling Environments for Terminal Agents
☆128 · Updated this week
RUCAIBox / FIGA
[ICLR 2024] This is the official implementation for the paper: "Beyond imitation: Leveraging fine-grained quality signals for alignment"
☆10 · May 5, 2024 · Updated 2 years ago
camel-ai / seta-env
💻 SETA: Scaling Environments for Terminal Agents - Environments
☆143 · Feb 16, 2026 · Updated 5 months ago
phonism / CP-Zero
Based on the R1-Zero method, using rule-based rewards and GRPO on the Code Contests dataset.
☆18 · Apr 22, 2025 · Updated last year
harbor-framework / terminal-bench-challenges
☆19 · Jun 18, 2026 · Updated last month
kwaipilot / SWE-Compass
☆18 · Mar 28, 2026 · Updated 4 months ago
harbor-framework / terminal-bench
A benchmark for LLMs on complicated tasks in the terminal
☆2,499 · Jul 11, 2026 · Updated 2 weeks ago
harbor-framework / harbor
Framework for evaluating and improving agents
☆3,647 · Updated this week
open-thoughts / OpenThoughts-Agent
Data recipes and robust infrastructure for training AI agents
☆272 · Updated this week
harbor-framework / harbor-datasets
☆36 · May 16, 2026 · Updated 2 months ago
icip-cas / LiteCoder
Advancing Small and Medium-sized Code Agents.
☆17 · May 29, 2026 · Updated 2 months ago
LeiLiLab / HardTestGen
☆17 · Jan 27, 2026 · Updated 6 months ago
sail-sg / Cheating-LLM-Benchmarks
[ICLR 2025] Cheating Automatic LLM Benchmarks: Null Models Achieve High Win Rates (Oral)
☆86 · Oct 23, 2024 · Updated last year
multi-swe-bench / MagentLess
☆13 · Jul 31, 2025 · Updated 11 months ago
microsoft / SWE-bench-Live
[NeurIPS 2025 D&B] 🚀 SWE-bench Goes Live!
☆214 · Jun 11, 2026 · Updated last month
sail-sg / regmix
[ICLR 2025] 🧬 RegMix: Data Mixture as Regression for Language Model Pre-training (Spotlight)
☆195 · Feb 17, 2025 · Updated last year
sail-sg / tty-use
☆15 · Oct 13, 2025 · Updated 9 months ago
liuzuxin / Bullet-Safety-Gym
An open-source framework to benchmark and assess safety specifications of Reinforcement Learning problems.
☆14 · Aug 25, 2023 · Updated 2 years ago
multimodal-art-projection / NL2RepoBench
☆145 · May 13, 2026 · Updated 2 months ago
harbor-framework / awesome-harbor
A curated list of awesome Harbor ecosystem projects
☆48 · May 29, 2026 · Updated 2 months ago
SWE-bench / sb-cli
Run SWE-bench evaluations remotely
☆78 · Aug 14, 2025 · Updated 11 months ago
MiniMax-AI / mini-vela
☆37 · Apr 2, 2026 · Updated 3 months ago
SWE-bench / SWE-smith
[NeurIPS 2025 D&B Spotlight] Scaling Data for SWE-agents
☆717 · Updated this week
TuringEnterprises / SWE-Bench-plus-plus
SWE-Bench-plus-plus
☆25 · Feb 5, 2026 · Updated 5 months ago
Hambaobao / SWE-Flow
SWE-Flow: Synthesizing Software Engineering Data in a Test-Driven Manner
☆40 · Jun 29, 2025 · Updated last year
NoCode-bench / NoCode-bench
☆21 · May 20, 2026 · Updated 2 months ago
bytedance / Repo2Run
Repo2Run is an LLM-based agent that automates environment configuration by generating error-free Dockerfiles for Python repositories.
☆196 · Jun 10, 2026 · Updated last month
hkust-nlp / Toolathlon
[ICLR 2026] The Tool Decathlon: Benchmarking Language Agents for Diverse, Realistic, and Long-Horizon Task Execution
☆443 · Updated this week
satori-reasoning / Satori-SWE
☆21 · May 30, 2025 · Updated last year
EuniAI / awesome-code-agents
A living collection of frontier research on code agents, from the code we build to the worlds we act in.
☆117 · Updated this week
adobe-research / ImageFolder
☆20 · Dec 8, 2024 · Updated last year
SWE-Gym / SWE-Gym
Code for Paper: Training Software Engineering Agents and Verifiers with SWE-Gym [ICML 2025]
☆712 · Jul 29, 2025 · Updated last year
sail-sg / ActivePRM
☆21 · Apr 16, 2025 · Updated last year
SWE-Perf / SWE-Perf
☆52 · Oct 28, 2025 · Updated 9 months ago
sail-sg / sailor2
🔱 Sailor2: Sailing in South-East Asia with Inclusive Multilingual LLMs
☆73 · Mar 21, 2025 · Updated last year