sunblaze-ucb/AgentSynth

sunblaze-ucb / AgentSynth

[ICLR 2026] AgentSynth: Scalable Task Generation for Generalist Computer-Use Agents

☆49

Alternatives and similar repositories for AgentSynth

Users who are interested in AgentSynth are comparing it to the repositories listed below.

cxcscmu / deepresearch_benchmarking
☆29 · Mar 10, 2026 · Updated 4 months ago
Yarayx / livelongbench
The first spoken long-text dataset derived from live streams, designed to reflect the redundancy-rich and conversational nature of real-w…
☆12 · Jun 28, 2025 · Updated last year
shengliu66 / FractionalReason
Official GitHub repo for "Fractional Reasoning via Latent Steering Vectors Improves Inference Time Compute"
☆17 · Jun 30, 2025 · Updated last year
OSU-NLP-Group / ACuRL
An Autonomous Curriculum Reinforcement Learning framework that steers agents to continually learn in specific environments with zero huma…
☆38 · Jun 7, 2026 · Updated last month
DualityRL / multi-attempt
☆19 · Mar 10, 2025 · Updated last year
yuleiqin / RAIF
A Recipe for Building LLM Reasoners to Solve Complex Instructions
☆32 · Oct 9, 2025 · Updated 9 months ago
QingFei1 / R-Search
[ACL 2026] R-Search: Empowering LLM Reasoning with Search via Multi-Reward Reinforcement Learning
☆35 · Jan 4, 2026 · Updated 6 months ago
WebChoreArena / WebChoreArena
[COLM 2026]
☆36 · Jul 9, 2026 · Updated 2 weeks ago
bigai-nlco / RuleReasoner
[ICLR 2026] RuleReasoner: Reinforced Rule-based Reasoning via Domain-aware Dynamic Sampling
☆39 · Feb 25, 2026 · Updated 5 months ago
SimengSun / ChapterBreak
☆12 · Jun 5, 2024 · Updated 2 years ago
LaoKuiZe / AppAgent-Pro
☆16 · Aug 27, 2025 · Updated 10 months ago
sled-group / 3D-GRAND
[CVPR 2025] 3D-GRAND: Towards Better Grounding and Less Hallucination for 3D-LLMs
☆54 · Jun 13, 2024 · Updated 2 years ago
OSU-NLP-Group / Explorer
[ACL'25 (Findings)] Explorer: Scaling Exploration-driven Web Trajectory Synthesis for Multimodal Web Agents
☆29 · Feb 17, 2026 · Updated 5 months ago
jmanhype / ace-playbook
Self-improving LLM system using Generator-Reflector-Curator pattern for online learning from execution feedback
☆36 · Jun 9, 2026 · Updated last month
mbzuai-oryx / Agent-X
[ICLR 2026] Agent-X: Evaluating Deep Multimodal Reasoning in Vision-Centric Agentic Tasks
☆43 · Apr 28, 2026 · Updated 2 months ago
Infini-AI-Lab / GRESO
☆82 · Jun 8, 2026 · Updated last month
psunlpgroup / ReaLMistake
This repository includes a benchmark and code for the paper "Evaluating LLMs at Detecting Errors in LLM Responses".
☆32 · Aug 18, 2024 · Updated last year
limenlp / SEA
Official Implementation for the paper "Discovering Knowledge Deficiencies of Language Models on Massive Knowledge Base"
☆27 · Sep 2, 2025 · Updated 10 months ago
marcfargas / pi-tramp
TRAMP-like transparent remote execution for pi — tools run remotely via SSH/Docker, pi stays local
☆20 · May 23, 2026 · Updated 2 months ago
UMass-Embodied-AGI / BudgetGuidance
[ACL'26 Findings] Steering LLM Thinking with Budget Guidance
☆33 · Feb 19, 2026 · Updated 5 months ago
JIA-Lab-research / ARPO
Official Implementation of ARPO: End-to-End Policy Optimization for GUI Agents with Experience Replay
☆162 · May 29, 2025 · Updated last year
RUCAIBox / R1-Searcher-plus
R1-Searcher++: Incentivizing the Dynamic Knowledge Acquisition of LLMs via Reinforcement Learning
☆82 · May 25, 2025 · Updated last year
sail-sg / feedback-conditional-policy
Code for "Language Models Can Learn from Verbal Feedback Without Scalar Rewards"
☆65 · Jan 5, 2026 · Updated 6 months ago
tangzhy / RealCritic
☆15 · Jan 27, 2025 · Updated last year
LinxinS97 / NLPBench
NLPBench: Evaluating NLP-Related Problem-solving Ability in Large Language Models
☆10 · Oct 27, 2023 · Updated 2 years ago
rhyang2021 / ARIA
Source code for our paper: "ARIA: Training Language Agents with Intention-Driven Reward Aggregation".
☆30 · Aug 9, 2025 · Updated 11 months ago
UCDvision / low-budget-al
PyTorch implementation of "A Simple Baseline for Low-Budget Active Learning".
☆14 · Dec 22, 2021 · Updated 4 years ago
lilakk / BLEUBERI
Official repository for "BLEUBERI: BLEU is a surprisingly effective reward for instruction following"
☆32 · Jun 5, 2025 · Updated last year
luka-group / vlm-knowledge-conflict
Code for paper "Unraveling Cross-Modality Knowledge Conflicts in Large Vision-Language Models."
☆54 · Oct 19, 2024 · Updated last year
uclaml / COPS
The official implementation of Cross-Task Experience Sharing (COPS)
☆29 · Oct 23, 2024 · Updated last year
horizon-llm / Think-RM
[NeurIPS 2025] Think-RM: Enabling Long-Horizon Reasoning in Generative Reward Models
☆17 · Nov 2, 2025 · Updated 8 months ago
Euphoria16 / UI-Genie
[NeurIPS 2025] UI-Genie: A Self-Improving Approach for Iteratively Boosting MLLM-based Mobile GUI Agents
☆60 · Nov 27, 2025 · Updated 7 months ago
ServiceNow / typed-dag
Causal discovery with typed directed acyclic graphs (t-DAG). This is a ServiceNow Research project that was started at Element AI.
☆13 · Jul 6, 2023 · Updated 3 years ago
spiral-rl / spiral
SPIRAL: Self-Play on Zero-Sum Games Incentivizes Reasoning via Multi-Agent Multi-Turn Reinforcement Learning
☆199 · Mar 27, 2026 · Updated 3 months ago
ritikamangla / QSalience
https://arxiv.org/abs/2404.10917
☆14 · Mar 18, 2025 · Updated last year
LARS-research / TREFE
Searching a High Performance Feature Extractor for Text Recognition Network. TPAMI 2022
☆13 · Nov 25, 2022 · Updated 3 years ago
GAIR-NLP / LIMOPro
☆15 · May 27, 2025 · Updated last year
linkedin / ControlLLM
Control LLM
☆23 · Apr 6, 2025 · Updated last year
Jaiy / Ground-aware-Seg
Ground-Aware Point Cloud Semantic Segmentation for Autonomous Driving. ACM Multimedia 2019.
☆12 · Sep 19, 2019 · Updated 6 years ago