convergence-ai / webgames
Challenges for general-purpose web-browsing AI agents
☆47Updated 2 months ago
Alternatives and similar repositories for webgames:
Users who are interested in webgames are comparing it to the libraries listed below
- Code for ScribeAgent paper☆57Updated 2 months ago
- Computer Agent Arena: Test & compare AI agents in real desktop apps & web environments. Code/data coming soon!☆44Updated last month
- Implementation of the paper: "AssistantBench: Can Web Agents Solve Realistic and Time-Consuming Tasks?"☆54Updated 4 months ago
- ☆147Updated 2 months ago
- accompanying material for sleep-time compute paper☆77Updated last week
- SiriuS: Self-improving Multi-agent Systems via Bootstrapped Reasoning☆53Updated last month
- LLM reads a paper and produces a working prototype☆55Updated 3 weeks ago
- Official Repo for InSTA: Towards Internet-Scale Training For Agents☆35Updated last week
- ☆76Updated 6 months ago
- WebLINX is a benchmark for building web navigation agents with conversational capabilities☆146Updated 2 months ago
- ☆50Updated 5 months ago
- ReDel is a toolkit for researchers and developers to build, iterate on, and analyze recursive multi-agent systems. (EMNLP 2024 Demo)☆77Updated last month
- AGI SDK☆35Updated this week
- Open Agent Computer Interface☆68Updated 5 months ago
- Benchmark and research code for the paper SWEET-RL: Training Multi-Turn LLM Agents on Collaborative Reasoning Tasks☆186Updated 3 weeks ago
- Systematic evaluation framework that automatically rates overthinking behavior in large language models.☆88Updated 3 weeks ago
- Framework and toolkits for building and evaluating collaborative agents that can work together with humans.☆74Updated 3 weeks ago
- Official implementation for "ScoreFlow: Mastering LLM Agent Workflows via Score-based Preference Optimization"☆67Updated 2 months ago
- ☆63Updated last month
- ☆11Updated 9 months ago
- Kura is a simple reproduction of the CLIO paper which uses language models to label user behaviour before clustering them based on embedd…☆104Updated 3 weeks ago
- ☆114Updated 2 months ago
- Dynamic Cheatsheet: Test-Time Learning with Adaptive Memory☆56Updated 3 weeks ago
- CursorCore: Assist Programming through Aligning Anything☆121Updated 2 months ago
- Official Repo for The Paper "Talk Structurally, Act Hierarchically: A Collaborative Framework for LLM Multi-Agent Systems"☆50Updated 2 months ago
- Agent computer interface for AI software engineer.☆68Updated this week
- Anchored Preference Optimization and Contrastive Revisions: Addressing Underspecification in Alignment☆55Updated 8 months ago
- SkillWeaver is a framework to enable web agent self-improvement through environment exploration and skill synthesis.☆72Updated 3 weeks ago
- Optimizing Causal LMs through GRPO with weighted reward functions and automated hyperparameter tuning using Optuna☆39Updated 3 months ago
- ☆85Updated last week