CRAB: Cross-environment Agent Benchmark for Multimodal Language Model Agents. https://crab.camel-ai.org/
★416 · Apr 17, 2026 · Updated 2 weeks ago
Alternatives and similar repositories for crab
Users that are interested in crab are comparing it to the libraries listed below.
- CAMEL: The first and the best multi-agent framework. Finding the Scaling Law of Agents. https://www.camel-ai.org ★16,869 · Updated this week
- The code for "Can Large Language Model Agents Simulate Human Trust Behaviors?" ★117 · Apr 6, 2025 · Updated last year
- Code for Paper: Autonomous Evaluation and Refinement of Digital Agents [COLM 2024] ★149 · Nov 26, 2024 · Updated last year
- WebLINX is a benchmark for building web navigation agents with conversational capabilities ★160 · Feb 11, 2025 · Updated last year
- OWL: Optimized Workforce Learning for General Multi-Agent Assistance in Real-World Task Automation ★19,719 · Apr 17, 2026 · Updated 2 weeks ago
- Loong: Synthesize Long CoTs at Scale through Verifiers. ★502 · Updated this week
- OASIS: Open Agent Social Interaction Simulations with One Million Agents. ★4,499 · Updated this week
- An open-source framework for collaborative AI agents, enabling diverse, distributed agents to team up and tackle complex tasks through in… ★822 · Oct 4, 2025 · Updated 7 months ago
- A Framework for Evaluating AI Agent Safety in Realistic Environments ★32 · Oct 2, 2025 · Updated 7 months ago
- Experimental Code for StructuredRAG: JSON Response Formatting with Large Language Models ★118 · Apr 27, 2026 · Updated last week
- [ICML'24] SeeAct is a system for generalist web agents that autonomously carry out tasks on any given website, with a focus on large mult… ★846 · Feb 3, 2025 · Updated last year
- A toolkit for building computer use AI agents ★193 · Jun 26, 2025 · Updated 10 months ago
- OpenResearcher, an advanced Scientific Research Assistant ★504 · Oct 10, 2024 · Updated last year
- An autoagentic AGI that is self-evolving and modular. ★965 · Sep 4, 2024 · Updated last year
- CAMEL framework-based multi-agent system for task-driven and dynamic environments ★111 · May 21, 2024 · Updated last year
- Code repo for the paper: Attacking Vision-Language Computer Agents via Pop-ups ★51 · Dec 23, 2024 · Updated last year
- Code for the paper Tree Search for Language Model Agents ★222 · Jul 25, 2024 · Updated last year
- [ICLR 2025] Automated Design of Agentic Systems ★1,565 · Jan 28, 2025 · Updated last year
- [ICLR 2025] A trinity of environments, tools, and benchmarks for general virtual agents ★232 · Jun 16, 2025 · Updated 10 months ago
- Code and data for "Lumos: Learning Agents with Unified Data, Modular Design, and Open-Source LLMs" ★477 · Mar 19, 2024 · Updated 2 years ago
- [NeurIPS 2024] OSWorld: Benchmarking Multimodal Agents for Open-Ended Tasks in Real Computer Environments ★2,817 · Apr 25, 2026 · Updated last week
- Implementation of the paper: "AssistantBench: Can Web Agents Solve Realistic and Time-Consuming Tasks?" ★70 · Dec 9, 2024 · Updated last year
- E2B Desktop Sandbox for LLMs. E2B Sandbox with desktop graphical environment that you can connect to any LLM for secure computer use. ★1,359 · Updated this week
- An agent orchestration framework for economic agents ★116 · Aug 12, 2025 · Updated 8 months ago
- The Library for LLM-based multi-agent applications ★102 · Jul 18, 2025 · Updated 9 months ago
- This library supports evaluating disparities in generated image quality, diversity, and consistency between geographic regions. ★20 · Jun 3, 2024 · Updated last year
- Evaluation framework for paper "VisualWebBench: How Far Have Multimodal LLMs Evolved in Web Page Understanding and Grounding?" ★65 · Oct 19, 2024 · Updated last year
- The model, data and code for the visual GUI Agent SeeClick ★478 · Jul 13, 2025 · Updated 9 months ago
- Agentless: an agentless approach to automatically solve software development problems ★2,042 · Dec 22, 2024 · Updated last year
- GUICourse: From General Vision Language Models to Versatile GUI Agents ★142 · Mar 1, 2026 · Updated 2 months ago
- An Open-source Framework for Data-centric, Self-evolving Autonomous Language Agents ★5,916 · Sep 26, 2024 · Updated last year
- The First Self-Improving Agentic Solution ★988 · Feb 5, 2026 · Updated 3 months ago
- Python SDK for AI agent monitoring, LLM cost tracking, benchmarking, and more. Integrates with most LLMs and agent frameworks including C… ★5,512 · Mar 19, 2026 · Updated last month
- A lightweight framework for building LLM-based agents ★2,243 · Updated this week
- [NeurIPS 2023 D&B] Code repository for InterCode benchmark https://arxiv.org/abs/2306.14898 ★246 · May 5, 2024 · Updated 2 years ago
- (ICLR 2025) The Official Code Repository for GUI-World. ★69 · Dec 18, 2024 · Updated last year
- [ACL 2024] PCA-Bench: Evaluating Multimodal Large Language Models in Perception-Cognition-Action Chain ★107 · Mar 14, 2024 · Updated 2 years ago
- ★171 · Jan 25, 2024 · Updated 2 years ago
- A curated list of papers and resources for multi-modal Graphical User Interface (GUI) agents. ★1,175 · Aug 17, 2025 · Updated 8 months ago