wendell0218 / GVA-Survey
Generalist Virtual Agents: A Survey on Autonomous Agents Across Digital Platforms
☆16 Updated 3 weeks ago
Alternatives and similar repositories for GVA-Survey:
Users who are interested in GVA-Survey compare it to the repositories listed below.
- An Easy-to-use Hallucination Detection Framework for LLMs. ☆58 Updated 11 months ago
- [preprint] We propose a novel fine-tuning method, Separate Memory and Reasoning, which combines prompt tuning with LoRA. ☆43 Updated 3 months ago
- Official code of *Virgo: A Preliminary Exploration on Reproducing o1-like MLLM* ☆97 Updated last month
- Implementation for the research paper "Enhancing LLM Reasoning via Critique Models with Test-Time and Training-Time Supervision". ☆52 Updated 4 months ago
- Code and data for OS-Genesis: Automating GUI Agent Trajectory Construction via Reverse Task Synthesis ☆118 Updated this week
- The official code repository for PRMBench. ☆68 Updated last month
- A Self-Training Framework for Vision-Language Reasoning ☆73 Updated 2 months ago
- ☆28 Updated 6 months ago
- OpenRFT: Adapting Reasoning Foundation Model for Domain-specific Tasks with Reinforcement Fine-Tuning ☆125 Updated 3 months ago
- ☆66 Updated 9 months ago
- ☆54 Updated 5 months ago
- ☆59 Updated this week
- A Survey of Efficient Reasoning for Large Reasoning Models: Language, Multimodality, and Beyond ☆42 Updated this week
- Building a comprehensive and handy list of papers for GUI agents ☆269 Updated 2 weeks ago
- Watch Every Step! LLM Agent Learning via Iterative Step-level Process Refinement (EMNLP 2024 Main Conference) ☆57 Updated 5 months ago
- [NeurIPS 2024] CharXiv: Charting Gaps in Realistic Chart Understanding in Multimodal LLMs ☆100 Updated 3 months ago
- The demo, code and data of FollowRAG ☆70 Updated 3 months ago
- ☆138 Updated 2 weeks ago
- This repository continuously collects the latest papers, technical reports, and benchmarks on multimodal reasoning. ☆25 Updated last week
- [NeurIPS 2024] The official implementation of the paper "Chain of Preference Optimization: Improving Chain-of-Thought Reasoning in LLMs". ☆107 Updated last week
- ☆30 Updated 5 months ago
- The code for the arXiv paper "CoT-based Synthesizer: Enhancing LLM Performance through Answer Synthesis" ☆23 Updated 2 months ago
- This is the repo for the paper "OS Agents: A Survey on MLLM-based Agents for Computer, Phone and Browser Use". ☆233 Updated this week
- GUICourse: From General Vision Language Models to Versatile GUI Agents ☆106 Updated 8 months ago
- Up-to-date curated list of state-of-the-art research, papers, and resources on hallucination in large vision-language models ☆107 Updated last month
- InfiAgent-DABench: Evaluating Agents on Data Analysis Tasks (ICML 2024) ☆113 Updated 3 months ago
- GitHub page for "Large Language Model-Brained GUI Agents: A Survey" ☆139 Updated this week
- The implementation for ICLR 2025 Oral: From Exploration to Mastery: Enabling LLMs to Master Tools via Self-Driven Interactions. ☆21 Updated this week
- ☆30 Updated last week
- Paper collections of multi-modal LLM for Math/STEM/Code. ☆84 Updated 2 weeks ago