[NeurIPS 2025] A multimodal agent that can interact with its own PC in a multimodal manner.
☆38Apr 23, 2026Updated 2 months ago
Alternatives and similar repositories for InfantAgent
Users that are interested in InfantAgent are comparing it to the libraries listed below.
- The raw UserRL repo under construction☆104Jun 2, 2026Updated 3 weeks ago
- ☆32Jul 3, 2025Updated 11 months ago
- The first spoken long-text dataset derived from live streams, designed to reflect the redundancy-rich and conversational nature of real-w…☆12Jun 28, 2025Updated last year
- ☆21Apr 3, 2025Updated last year
- Website for HKU NLP group (under construction)☆14Mar 20, 2026Updated 3 months ago
- ☆18Mar 2, 2026Updated 3 months ago
- ☆130Oct 3, 2025Updated 8 months ago
- Official Implementation of ARPO: End-to-End Policy Optimization for GUI Agents with Experience Replay☆159May 29, 2025Updated last year
- [ICLR 2026] Thinking on the Fly: Test-Time Reasoning Enhancement via Latent Thought Policy Optimization☆32Mar 6, 2026Updated 3 months ago
- ☆33May 9, 2025Updated last year
- Benchmark of complex, multimodal desktop-oriented tasks for advanced GUI-navigation AI agents☆23May 7, 2025Updated last year
- ☆12Aug 22, 2023Updated 2 years ago
- [ACL 2025 Findings] Text2World: Benchmarking Large Language Models for Symbolic World Model Generation☆29Feb 25, 2025Updated last year
- Official Repo of Your Agent May Misevolve: Emergent Risks in Self-evolving LLM Agents☆84Jun 2, 2026Updated 3 weeks ago
- This repo contains the dataset for paper: Creating a Dataset Supporting Translation Between OpenMP Fortran and C++ Code☆15Dec 1, 2023Updated 2 years ago
- [NAACL'25] "Revealing the Barriers of Language Agents in Planning"☆13Jun 22, 2025Updated last year
- HarnessX is a harness foundry: forge any number of agent harnesses from reusable processors and bundles, pair each with any model, and ev…☆109Jun 17, 2026Updated 2 weeks ago
- ☆36Jan 28, 2026Updated 5 months ago
- ☆16Jan 19, 2026Updated 5 months ago
- Utility to use ElevenLabs streaming in the command line☆11Aug 8, 2023Updated 2 years ago
- ☆22May 3, 2025Updated last year
- Advances and Frontiers of LLM-based Issue Resolution in Software Engineering: A Comprehensive Survey☆84Apr 22, 2026Updated 2 months ago
- The paper list of multilingual pre-trained models (Continually Updated).☆25Jun 18, 2024Updated 2 years ago
- [ACL 2026] LearnAct: Few-Shot Mobile GUI Agent with a Unified Demonstration Benchmark☆48Apr 18, 2026Updated 2 months ago
- implementation of https://arxiv.org/pdf/2312.09299☆21Jul 3, 2024Updated last year
- This repository contains the code and data for the paper "VisOnlyQA: Large Vision Language Models Still Struggle with Visual Perception o…☆29Jul 9, 2025Updated 11 months ago
- A Lightweight Visual Reasoning Benchmark for Evaluating Large Multimodal Models through Complex Diagrams in Coding Tasks☆15Feb 25, 2025Updated last year
- [NeurIPS 2024] MACM: Utilizing a Multi-Agent System for Condition Mining in Solving Complex Mathematical Problems☆94Jul 24, 2024Updated last year
- [ICCV 2025 Highlight] Less is More: Empowering GUI Agent with Context-Aware Simplification☆49Mar 12, 2026Updated 3 months ago
- ☆14Mar 2, 2025Updated last year
- G1: Bootstrapping Perception and Reasoning Abilities of Vision-Language Model via Reinforcement Learning☆103May 20, 2025Updated last year
- ☆55Apr 14, 2026Updated 2 months ago
- ☆10Oct 22, 2024Updated last year
- ☆34Sep 19, 2025Updated 9 months ago
- ☆20Apr 24, 2024Updated 2 years ago
- Container-free RL framework for training software engineering agents☆67Mar 4, 2026Updated 3 months ago
- ☆13Oct 19, 2023Updated 2 years ago
- [ICLR 2026] JanusCoder: Towards a Foundational Visual-Programmatic Interface for Code Intelligence☆79May 9, 2026Updated last month
- [ACL 2025] Research code for the paper "OS-Kairos: Adaptive Interaction for MLLM-Powered GUI Agents"☆21Jun 19, 2025Updated last year