Official Implementation of "JARVIS-VLA: Post-Training Large-Scale Vision Language Models to Play Visual Games with Keyboards and Mouse"
☆157Aug 27, 2025Updated 9 months ago
Alternatives and similar repositories for JarvisVLA
Users that are interested in JarvisVLA are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.
Sorting:
- ☁️ KUMO: Generative Evaluation of Complex Reasoning in Large Language Models☆20Jun 4, 2025Updated last year
- ☆49Oct 21, 2025Updated 7 months ago
- ☆30Jun 25, 2024Updated last year
- MineStudio: A Streamlined Package for Minecraft AI Agent Development☆384May 12, 2026Updated last month
- GROOT: Learning to Follow Instructions by Watching Gameplay Videos (ICLR'24, Spotlight)☆67Dec 18, 2023Updated 2 years ago
- Deploy to Railway using AI coding agents - Free Credits Offer • AdUse Claude Code, Codex, OpenCode, and more. Autonomous software development now has the infrastructure to match with Railway.
- Official Implementation of Paper "ROCKET-2: Steering Visuomotor Policy via Cross-View Goal Alignment" (AAAI'26)☆41Jul 2, 2025Updated 11 months ago
- Paper List of Minecraft Agents☆69May 24, 2026Updated 3 weeks ago
- official repo for AGNOSTOS, a cross-task manipulation benchmark, and X-ICM method, a cross-task in-context manipulation (VLA) method☆69May 28, 2026Updated 3 weeks ago
- Official implementation of paper "ROCKET-1: Mastering Open-World Interaction with Visual-Temporal Context Prompting" (CVPR'25)☆46Apr 13, 2025Updated last year
- [ICCV 2025] CombatVLA: An Efficient Vision-Language-Action Model for Combat Tasks in 3D Action Role-Playing Games☆70Nov 19, 2025Updated 6 months ago
- MLLM @ Game☆17May 12, 2025Updated last year
- Awesome_CV的中文版本,clone本项目到overleaf即可轻松愉快编写自己的CV☆17May 24, 2024Updated 2 years ago
- [EMNLP 2022] Adapting a Language Model While Preserving its General Knowledge☆21Feb 12, 2023Updated 3 years ago
- Implementation of "Describe, Explain, Plan and Select: Interactive Planning with Large Language Models Enables Open-World Multi-Task Agen…☆293Aug 3, 2023Updated 2 years ago
- Managed hosting for WordPress and PHP on Cloudways • AdManaged hosting for WordPress, Magento, Laravel, or PHP apps, on multiple cloud providers. Deploy in minutes on Cloudways by DigitalOcean.
- [Technical Report] Official PyTorch implementation code for realizing the technical part of Phantom of Latent representing equipped with …☆64Oct 9, 2024Updated last year
- Repo for Paper "OpenHA: A Series of Open-Source Hierarchical Agentic Models in Minecraft"☆37Jun 5, 2026Updated last week
- [NeurIPS 2024] Official Implementation for Optimus-1: Hybrid Multimodal Memory Empowered Agents Excel in Long-Horizon Tasks☆101Jun 17, 2025Updated last year
- ☆73May 23, 2025Updated last year
- AutoEval: Autonomous Evaluation of Generalist Robot Manipulation Policies in the Real World | CoRL 2025☆97Mar 26, 2026Updated 2 months ago
- We develop world models that can be adapted with natural language. Intergrating these models into artificial agents allows humans to effe…☆25Feb 10, 2024Updated 2 years ago
- The first spoken long-text dataset derived from live streams, designed to reflect the redundancy-rich and conversational nature of real-w…☆12Jun 28, 2025Updated 11 months ago
- This project is the official implementation of 'DreamOmni3: Scribble-based Editing and Generation''☆40Dec 30, 2025Updated 5 months ago
- [CVPR 2025] Official Implementation for Optimus-2: Multimodal Minecraft Agent with Goal-Observation-Action Conditioned Policy☆25Jun 17, 2025Updated last year
- GPU virtual machines on DigitalOcean Gradient AI • AdGet to production fast with high-performance AMD and NVIDIA GPUs you can spin up in seconds. The definition of operational simplicity.
- Official repo for From Intention to Execution: Probing the Generalization Boundaries of Vision-Language-Action Models☆34Nov 2, 2025Updated 7 months ago
- The code for "VISTA: Enhancing Long-Duration and High-Resolution Video Understanding by VIdeo SpatioTemporal Augmentation" [CVPR2025]☆21Feb 27, 2025Updated last year
- Paper & Project lists of cutting-edge research on visual navigation and embodied AI.☆36Apr 27, 2023Updated 3 years ago
- Official Implementation of "LeX-Art: Rethinking Text Generation via Scalable High-Quality Data Synthesis"☆84Aug 25, 2025Updated 9 months ago
- [CVPR 2025] OmniMMI: A Comprehensive Multi-modal Interaction Benchmark in Streaming Video Contexts☆18Apr 2, 2025Updated last year
- ☆17Aug 1, 2025Updated 10 months ago
- The official repository of "SmartAgent: Chain-of-User-Thought for Embodied Personalized Agent in Cyber World".☆27Aug 20, 2025Updated 9 months ago
- RAG-RewardBench: Benchmarking Reward Models in Retrieval Augmented Generation for Preference Alignment☆17Dec 19, 2024Updated last year
- [ICLR26] AI-based scaling law discovery☆28Jan 30, 2026Updated 4 months ago
- Deploy open-source AI quickly and easily - Special Bonus Offer • AdRunpod Hub is built for open source. One-click deployment and autoscaling endpoints without provisioning your own infrastructure.
- Code repo for "Read Anywhere Pointed: Layout-aware GUI Screen Reading with Tree-of-Lens Grounding"☆30May 12, 2026Updated last month
- [ICLR 2026🔥] MHLA: Restoring Expressivity of Linear Attention via Token-Level Multi-Head☆151May 19, 2026Updated 3 weeks ago
- Aligning Agentic World Models via Knowledgeable Experience Learning☆36May 15, 2026Updated last month
- ORES: Open-vocabulary Responsible Visual Synthesis☆14Dec 12, 2023Updated 2 years ago
- Introduction about AWESOME_ENTROPY+LRM_PAPERS☆31Dec 16, 2025Updated 6 months ago
- Official implementation of Next Block Prediction: Video Generation via Semi-Autoregressive Modeling☆42Feb 12, 2025Updated last year
- support Large Vocabulary Instance Segmentation (LVIS) dataset for mmdetection☆16Apr 24, 2020Updated 6 years ago