gimme1dollar / b-moca
Benchmarking Mobile Device Control Agents across Diverse Configurations (ICLR 2024 workshop GenAI4DM spotlight presentation)
☆23 · Updated last month
Related projects:
- SmartPlay is a benchmark for Large Language Models (LLMs). Uses a variety of games to test various important LLM capabilities as agents. … ☆115 · Updated 5 months ago
- GROOT: Learning to Follow Instructions by Watching Gameplay Videos ☆54 · Updated 9 months ago
- A Universal Platform for Training and Evaluation of Mobile Interaction ☆31 · Updated last month
- Official implementation of the DECKARD Agent from the paper "Do Embodied Agents Dream of Pixelated Sheep?" ☆84 · Updated last year
- The official implementation of the paper "Read to Play (R2-Play): Decision Transformer with Multimodal Game Instruction". ☆32 · Updated 7 months ago
- AndroidWorld is an environment and benchmark for autonomous agents ☆86 · Updated this week
- Code for Paper: Autonomous Evaluation and Refinement of Digital Agents ☆81 · Updated last week
- Implementation of "Open-World Multi-Task Control Through Goal-Aware Representation Learning and Adaptive Horizon Prediction" ☆42 · Updated last year
- ☆65 · Updated 2 months ago
- Research Code for "ArCHer: Training Language Model Agents via Hierarchical Multi-Turn RL" ☆84 · Updated 5 months ago
- ☆131 · Updated 4 months ago
- [ICLR 2024] Trajectory-as-Exemplar Prompting with Memory for Computer Control ☆48 · Updated 3 weeks ago
- Intelligent Go-Explore: Standing on the Shoulders of Giant Foundation Models ☆41 · Updated 3 months ago
- Advantage Leftover Lunch Reinforcement Learning (A-LoL RL): Improving Language Models with Advantage-based Offline Policy Gradients ☆24 · Updated last week
- Efficient World Models with Context-Aware Tokenization. ICML 2024 ☆73 · Updated 2 months ago
- The source code of the paper "Leveraging Pre-trained Large Language Models to Construct and Utilize World Models for Model-based Task Pla… ☆69 · Updated last month
- ☆102 · Updated 2 months ago
- ☆37 · Updated 9 months ago
- Code for Contrastive Preference Learning (CPL) ☆147 · Updated 6 months ago
- [ICLR 2024] Code for the paper "Text2Reward: Automated Dense Reward Function Generation for Reinforcement Learning" ☆113 · Updated 8 months ago
- Code for most of the experiments in the paper "Understanding the Effects of RLHF on LLM Generalisation and Diversity" ☆35 · Updated 8 months ago
- Official Repo of LangSuitE ☆74 · Updated last month
- An OpenAI gym environment to evaluate the ability of LLMs (e.g., GPT-4, Claude) in long-horizon reasoning and task planning in dynamic mult… ☆61 · Updated last year
- BASALT Benchmark datasets, evaluation code and agent training example. ☆19 · Updated 9 months ago
- Guide Your Agent with Adaptive Multimodal Rewards (NeurIPS 2023 Accepted) ☆32 · Updated 11 months ago
- ☆23 · Updated 4 months ago
- An RL approach to enable cost-effective, intelligent interactions between a local agent and a remote LLM ☆60 · Updated 3 weeks ago
- Official repo for paper DigiRL: Training In-The-Wild Device-Control Agents with Autonomous Reinforcement Learning. ☆200 · Updated last month
- Implementation of TWOSOME ☆42 · Updated 4 months ago
- ☆21 · Updated 2 months ago