Visual-Agent / DeepEyesV2
☆449Updated this week
Alternatives and similar repositories for DeepEyesV2
Users who are interested in DeepEyesV2 are comparing it to the libraries listed below.
- A Scientific Multimodal Foundation Model☆620Updated 2 months ago
- Official Repository for "Glyph: Scaling Context Windows via Visual-Text Compression"☆539Updated last month
- 🛠️ DeepAgent: A General Reasoning Agent with Scalable Toolsets☆874Updated last month
- This repository collects and organises state‑of‑the‑art papers on spatial reasoning for Multimodal Vision–Language Models (MVLMs).☆249Updated last week
- codes for R-Zero: Self-Evolving Reasoning LLM from Zero Data (https://www.arxiv.org/pdf/2508.05004)☆710Updated last week
- Agent0 Series: Self-Evolving Agents from Zero Data☆898Updated this week
- OpenThinkIMG is an end-to-end open-source framework that empowers LVLMs to think with images.☆335Updated 6 months ago
- The official repo for "Parallel-R1: Towards Parallel Thinking via Reinforcement Learning"☆244Updated last month
- A reproduction of the Deepseek-OCR model including training☆200Updated last month
- 🚀ReVisual-R1 is a 7B open-source multimodal language model that follows a three-stage curriculum—cold-start pre-training, multimodal rei…☆191Updated 2 weeks ago
- Next paradigm for LLM Agent. Unify plan and action through recursive code generation for adaptive, human-like decision-making.☆515Updated 3 weeks ago
- The official repository of "R-4B: Incentivizing General-Purpose Auto-Thinking Capability in MLLMs via Bi-Mode Integration"☆130Updated 3 months ago
- Code and implementations for the paper "AgentGym-RL: Training LLM Agents for Long-Horizon Decision Making through Multi-Turn Reinforcemen…☆537Updated 3 months ago
- Latent Collaboration in Multi-Agent Systems☆615Updated last week
- OmniVinci is an omni-modal LLM for joint understanding of vision, audio, and language.☆607Updated last month
- Chain-of-Agents: End-to-End Agent Foundation Models via Multi-Agent Distillation and Agentic RL.☆508Updated 3 months ago
- Fully Open Framework for Democratized Multimodal Training☆662Updated last week
- ☆852Updated 3 months ago
- 🐉 Loong: Synthesize Long CoTs at Scale through Verifiers.☆476Updated last month
- Official Repository for PosterGen☆201Updated last month
- [NeurIPS 2025] Thinkless: LLM Learns When to Think☆246Updated 3 months ago
- OpenCUA: Open Foundations for Computer-Use Agents☆608Updated last week
- MiMo-VL☆611Updated 4 months ago
- AgentFlow: In-the-Flow Agentic System Optimization☆1,425Updated last week
- [EMNLP 2025] Awesome RAG Reasoning Resources☆369Updated 5 months ago
- Official repository for DR Tulu: Reinforcement Learning with Evolving Rubrics for Deep Research☆476Updated this week
- Repo for "Adaptation of Agentic AI"☆381Updated this week
- The paper list of "Memory in the Age of AI Agents: A Survey"☆507Updated last week
- This is the official Python version of Vision-Zero: Scalable VLM Self-Improvement via Strategic Gamified Self-Play.☆104Updated 2 months ago
- GPU-optimized framework for training diffusion language models at any scale. The backend of Quokka, Super Data Learners, and OpenMoE 2 tr…☆301Updated last month