xbmxb / CoCo-Agent
☆22Updated 5 months ago
Related projects
Alternatives and complementary repositories for CoCo-Agent
- GUI Odyssey is a comprehensive dataset for training and evaluating cross-app navigation agents. GUI Odyssey consists of 7,735 episodes fr…☆69Updated last week
- ☆21Updated last month
- GUICourse: From General Vision Language Models to Versatile GUI Agents☆84Updated 4 months ago
- Towards Large Multimodal Models as Visual Foundation Agents☆123Updated last week
- Official implementation for "Android in the Zoo: Chain-of-Action-Thought for GUI Agents" (Findings of EMNLP 2024)☆48Updated last month
- The Official Code Repository for GUI-World.☆41Updated 3 months ago
- A curated list of papers, repositories, tutorials, and anything else related to large language models for tools☆65Updated last year
- Official implementation for "MJ-Bench: Is Your Multimodal Reward Model Really a Good Judge for Text-to-Image Generation?"☆38Updated this week
- Evaluation framework for paper "VisualWebBench: How Far Have Multimodal LLMs Evolved in Web Page Understanding and Grounding?"☆47Updated last month
- [ICLR 2024] Trajectory-as-Exemplar Prompting with Memory for Computer Control☆51Updated 3 months ago
- Touchstone: Evaluating Vision-Language Models by Language Models☆78Updated 10 months ago
- ☆84Updated 11 months ago
- An LLM-free Multi-dimensional Benchmark for Multi-modal Hallucination Evaluation☆93Updated 10 months ago
- ☆58Updated 9 months ago
- [Arxiv] Aligning Modalities in Vision Large Language Models via Preference Fine-tuning☆74Updated 6 months ago
- A benchmark for evaluating the capabilities of large vision-language models (LVLMs)☆33Updated last year
- MATH-Vision dataset and code to measure Multimodal Mathematical Reasoning capabilities.☆69Updated last month
- Enhancing Large Vision Language Models with Self-Training on Image Comprehension.☆59Updated 5 months ago
- Official implementation of the paper "MMInA: Benchmarking Multihop Multimodal Internet Agents"☆38Updated 7 months ago
- A Survey on the Honesty of Large Language Models☆47Updated last month
- ☆54Updated 2 months ago
- [ICML 2024 Oral] Official code repository for MLLM-as-a-Judge.☆55Updated 3 months ago
- [ACL'24] Beyond One-Preference-Fits-All Alignment: Multi-Objective Direct Preference Optimization☆57Updated 3 months ago
- ☆25Updated 9 months ago
- ☆121Updated 3 weeks ago
- Data for evaluating GPT-4V☆11Updated last year
- Code for Paper: Autonomous Evaluation and Refinement of Digital Agents [COLM 2024]☆97Updated last month
- An Easy-to-use Hallucination Detection Framework for LLMs.☆48Updated 7 months ago
- [NeurIPS 2024 D&B Track] GTA: A Benchmark for General Tool Agents☆46Updated 2 weeks ago