Visual-Agent/DeepEyesV2

☆532

Alternatives and similar repositories for DeepEyesV2

Users who are interested in DeepEyesV2 are comparing it to the repositories listed below.

Visual-Agent / DeepEyes
☆1,137 · Nov 20, 2025 · Updated 3 months ago
EvolvingLMMs-Lab / LLaVA-OneVision-1.5-RL
Fully Open Framework for Democratized Multimodal Reinforcement Learning.
☆43 · Dec 19, 2025 · Updated 2 months ago
icon-lab / MedTrim
Official implementation of "Meta-Entity Driven Triplet Mining for Aligning Medical Vision-Language Models"
☆14 · Mar 19, 2025 · Updated 11 months ago
SiyuanYan1 / MAKE
[MICCAI'25 Early Accept] MAKE: Multi-Aspect Knowledge-Enhanced Vision-Language Pretraining for Zero-shot Dermatological Assessment
☆17 · Updated this week
microsoft / Do-You-See-Me
☆11 · Jun 21, 2025 · Updated 8 months ago
ByteDance-BandAI / CodeVision
[CVPR 2026] Thinking with Programming Vision: Towards a Unified View for Thinking with Images
☆56 · Jan 23, 2026 · Updated last month
zhaochen0110 / OpenThinkIMG
OpenThinkIMG is an end-to-end open-source framework that empowers LVLMs to think with images.
☆354 · Jun 1, 2025 · Updated 9 months ago
zhaochen0110 / Awesome_Think_With_Images
Resources and paper list for "Thinking with Images for LVLMs". This repository accompanies our survey on how LVLMs can leverage visual in…
☆1,338 · Feb 3, 2026 · Updated last month
InternLM / CapRL
[ICLR 2026] An official implementation of "CapRL: Stimulating Dense Image Caption Capabilities via Reinforcement Learning"
☆189 · Feb 8, 2026 · Updated 3 weeks ago
VTool-R1 / VTool-R1
[ICLR 2026] "VTool-R1: VLMs Learn to Think with Images via Reinforcement Learning on Multimodal Tool Use"
☆156 · Feb 7, 2026 · Updated 3 weeks ago
Mini-o3 / Mini-o3
Official Code for "Mini-o3: Scaling Up Reasoning Patterns and Interaction Turns for Visual Search"
☆405 · Jan 29, 2026 · Updated last month
XYPB / CLEFT
Official Implementation of "CLEFT: Language-Image Contrastive Learning with Efficient Large Language Model and Prompt Fine-Tuning" on MIC…
☆18 · Feb 12, 2025 · Updated last year
evalops / cognitive-dissonance-dspy
A multi-agent LLM system for detecting and resolving cognitive dissonance.
☆275 · Oct 14, 2025 · Updated 4 months ago
ali-vilab / TTS-VAR
Test-time Scaling for VAR models
☆31 · Sep 19, 2025 · Updated 5 months ago
real-absolute-AI / NoisyRollout
[NeurIPS 2025] NoisyRollout: Reinforcing Visual Reasoning with Data Augmentation
☆105 · Sep 18, 2025 · Updated 5 months ago
zhengxuJosh / Awesome-Multimodal-Spatial-Reasoning
This repository collects and organises state‑of‑the‑art papers on spatial reasoning for Multimodal Vision–Language Models (MVLMs).
☆280 · Feb 17, 2026 · Updated 2 weeks ago
milkosten / task-mcp-server
An MCP Task Server
☆11 · Mar 7, 2025 · Updated 11 months ago
MCG-NJU / VideoEval
VideoEval: Comprehensive Benchmark Suite for Low-Cost Evaluation of Video Foundation Model
☆14 · Jul 31, 2025 · Updated 7 months ago
TIGER-AI-Lab / Pixel-Reasoner
Pixel-Level Reasoning Model trained with RL [NeurIPS 2025]
☆278 · Nov 6, 2025 · Updated 3 months ago
YueYANG1996 / LaBo
CVPR 2023: Language in a Bottle: Language Model Guided Concept Bottlenecks for Interpretable Image Classification
☆105 · May 28, 2024 · Updated last year
temporal-community / temporal-videogen
Generate videos using Temporal, Google Gemini, and Veo 2.
☆16 · Jul 11, 2025 · Updated 7 months ago
jeremyarancio / VLM-Batch-Deployment
Batch Deployment for Document Parsing with AWS Batch & Qwen-2.5-VL
☆49 · Apr 28, 2025 · Updated 10 months ago
lilab-stanford / ELF
Ensemble Learning of Foundation Models
☆17 · Aug 29, 2025 · Updated 6 months ago
imanoop7 / Data-Extraction-Tools-Notebooks
I will be adding code for different kinds of open-source data extraction tools using Python
☆10 · Nov 15, 2024 · Updated last year
NKU-MetautoAI / awesome-large-vision-language-models
Advances in recent large vision language models (LVLMs)
☆15 · Sep 23, 2024 · Updated last year
Agora-Lab-AI / Atom
a suite of finetuned LLMs for atomically precise function calling 🧪
☆17 · Feb 6, 2026 · Updated 3 weeks ago
Azure-Samples / retail-search-with-ai
Retail Search with AI
☆14 · Feb 14, 2026 · Updated 2 weeks ago
xtong-zhang / Chain-of-Focus
☆61 · Dec 5, 2025 · Updated 2 months ago
FudanNLPLAB / MouSi
☆75 · Mar 7, 2024 · Updated last year
zzzhhzzz / Ground-R1
☆38 · Jul 14, 2025 · Updated 7 months ago
StarsfieldAI / R1-V
Witness the aha moment of VLM with less than $3.
☆4,036 · May 19, 2025 · Updated 9 months ago
facebookresearch / multimodal_rewardbench
Multimodal RewardBench
☆62 · Feb 21, 2025 · Updated last year
ashishpatel26 / ai-tutor-rag-system
This is a repository for the course "From Beginner to LLM Developer" by Towards AI.
☆12 · Jan 2, 2025 · Updated last year
Azure-Samples / azure-data-manager-for-energy-openai-demo
This is an end-to-end demo showing the power of LLM on top of Azure Data Manager for Energy data
☆13 · Apr 3, 2024 · Updated last year
Artessay / ArtSearch
A local search system implementation using Elasticsearch for Wikipedia data indexing and retrieval.
☆12 · May 17, 2025 · Updated 9 months ago
anair13 / bullet-manipulation-affordances
☆13 · Jun 3, 2022 · Updated 3 years ago
guanjinquan / CXRTrek
Interpreting Chest X-rays Like a Radiologist: A Benchmark with Clinical Reasoning, release the dataset and the model weight
☆13 · May 26, 2025 · Updated 9 months ago
zjucsq / PLA
[ICLR2023] Video Scene Graph Generation from Single-Frame Weak Supervision
☆12 · Sep 17, 2023 · Updated 2 years ago
stanford-iris-lab / batch-exploration
☆12 · Apr 25, 2022 · Updated 3 years ago