xlang-ai/VideoAgentTrek

The official repo of VideoAgentTrek

☆57

Alternatives and similar repositories for VideoAgentTrek

Users interested in VideoAgentTrek are comparing it to the repositories listed below.

xlang-ai / AgentNetTool
This is the official code base of AgentNetTool in OpenCUA. Website: https://opencua.xlang.ai/
☆52 · Sep 3, 2025 · Updated 10 months ago
xlang-ai / computer-agent-arena
[ICLR 2026] Computer Agent Arena: Toward Human-Centric Evaluation and Analysis of Computer-Use Agents
☆67 · Feb 26, 2026 · Updated 4 months ago
xlang-ai / AgentTrek
[ICLR 2025 Spotlight] Agent Trajectory Synthesis via Guiding Replay with Web Tutorials
☆60 · Feb 21, 2025 · Updated last year
xlang-ai / OSWorld-G
[NeurIPS 2025 Spotlight] Scaling Computer-Use Grounding via UI Decomposition and Synthesis
☆172 · Jun 18, 2026 · Updated last month
xlang-ai / OpenCUA
[NeurIPS 2025 Spotlight] OpenCUA: Open Foundations for Computer-Use Agents
☆804 · May 25, 2026 · Updated 2 months ago
xlang-ai / CUA-Gym-Hub
CUA-Gym-Hub: mock web apps as reproducible RL training environments for computer-use agents
☆66 · Jul 9, 2026 · Updated 2 weeks ago
THUDM / SCALE-CUA
Open-source framework for computer use agents: VeriGen verifiable task synthesis, online RL training (AgentRL), and OSWorld/ScienceBoard …
☆33 · Updated this week
ServiceNow / GroundCUA
GroundCUA
☆129 · Mar 24, 2026 · Updated 4 months ago
xlang-ai / CUA-Gym
Scalable pipeline for synthesizing verifiable RLVR training data for computer-use agents
☆180 · May 26, 2026 · Updated last month
meituan / EvoCUA
EvoCUA: Evolving Computer Use Agent
☆332 · Mar 31, 2026 · Updated 3 months ago
xlang-ai / OSWorld-V2
OSWorld 2.0: Benchmarking Computer Use Agents on Long-Horizon Real-World Tasks
☆200 · Updated this week
hao-ai-lab / research-agent
☆17 · Feb 25, 2026 · Updated 4 months ago
WukLab / osworld-human
OSWorld-Human: Benchmarking the Efficiency of Computer-Use Agents
☆27 · May 17, 2026 · Updated 2 months ago
X-PLUG / ToolCUA
ToolCUA: Towards Optimal GUI-Tool Path Orchestration for Computer Use Agents
☆58 · May 13, 2026 · Updated 2 months ago
FanbinLu / STEVE-R1
R1-like Computer-use Agent
☆91 · Mar 21, 2025 · Updated last year
xlang-ai / FineVLA
Scalable annotation pipeline for action-aligned fine-grained instructions for Vision-Language-Action models
☆73 · Updated this week
microsoft / webgym
This project includes code for using the AsyncWebRL and WebGym frameworks to train web agent models.
☆46 · Jun 9, 2026 · Updated last month
JIA-Lab-research / ARPO
Official Implementation of ARPO: End-to-End Policy Optimization for GUI Agents with Experience Replay
☆162 · May 29, 2025 · Updated last year
niuzaisheng / ScreenExplorer
ScreenExplorer: Training a Vision-Language Model for Diverse Exploration in Open GUI World
☆26 · Jun 17, 2025 · Updated last year
tyshiwo1 / Accelerating-T2I-AR-with-SJD
[ICLR 2025] Implementation of Accelerating Auto-regressive Text-to-Image Generation with Training-free Speculative Jacobi Decoding
☆52 · Apr 21, 2025 · Updated last year
sail-sg / SkyLadder
The official repository for SkyLadder: Better and Faster Pretraining via Context Window Scheduling
☆43 · Dec 29, 2025 · Updated 6 months ago
Linn3a / siren
Official implementation of Selective Entropy Regularization (SIREN), proposed by paper 'Rethinking Entropy Regularization in Large Reason…
☆32 · Dec 10, 2025 · Updated 7 months ago
RUCBM / GUICourse
GUICourse: From General Vision Language Models to Versatile GUI Agents
☆143 · Mar 1, 2026 · Updated 4 months ago
cmu-l3 / gym-anything
Gym-Anything: Turn any Software into an Agent Environment
☆263 · Updated this week
cocoabench / cocoa-agent
An agent framework for building and evaluating general digital agents.
☆41 · Apr 21, 2026 · Updated 3 months ago
CG-Bench / CG-Bench
☆20 · Jan 26, 2025 · Updated last year
ai-agents-2030 / ViMo
☆26 · Apr 2, 2026 · Updated 3 months ago
yunfeixie233 / ViGaL
☆70 · Feb 4, 2026 · Updated 5 months ago
xlang-ai / OSWorld
[NeurIPS 2024] OSWorld: Benchmarking Multimodal Agents for Open-Ended Tasks in Real Computer Environments
☆3,034 · Updated this week
OS-Copilot / OS-Genesis
[ACL 2025] Code and data for OS-Genesis: Automating GUI Agent Trajectory Construction via Reverse Task Synthesis
☆188 · Oct 8, 2025 · Updated 9 months ago
naimengye / speculative-action
☆30 · Mar 9, 2026 · Updated 4 months ago
ArtemBaskal / model-based-testing-calculator
Model Based Testing of the App Based On The Description from Constructing the User Interface with Statecharts Book of Ian Horrocks using …
☆13 · Feb 20, 2024 · Updated 2 years ago
OS-Copilot / ScienceBoard
[ICLR 2026] Code, benchmark and environment for "ScienceBoard: Evaluating Multimodal Autonomous Agents in Realistic Scientific Workflows"
☆132 · Feb 2, 2026 · Updated 5 months ago
InternLM / Spatial-SSRL
[CVPR 2026] Official release of "Spatial-SSRL: Enhancing Spatial Understanding via Self-Supervised Reinforcement Learning"
☆133 · Apr 7, 2026 · Updated 3 months ago
visual-haystacks / mirage
🔥 [ICLR 2025] Official PyTorch Model "Visual Haystacks: A Vision-Centric Needle-In-A-Haystack Benchmark"
☆27 · Feb 9, 2025 · Updated last year
OpenIXCLab / CODA
CODA: Coordinating the Cerebrum and Cerebellum for a Dual-Brain Computer Use Agent with Decoupled Reinforcement Learning
☆37 · Aug 28, 2025 · Updated 10 months ago
WenyiWU0111 / CoMEM-Agent
Official repository for the paper "Auto-scaling Continuous Memory for GUI Agent"
☆29 · Feb 2, 2026 · Updated 5 months ago
mll-lab-nu / TStar
TStar is a unified temporal search framework for long-form video question answering
☆97 · Mar 23, 2026 · Updated 4 months ago
EternityJune25 / MVISU-Bench
[ACM MM 2025 🔥 Oral] MVISU-Bench: Benchmarking Mobile Agents for Real-World Tasks by Multi-App, Vague, Interactive, Single-App and Uneth…
☆15 · Mar 13, 2026 · Updated 4 months ago