JIA-Lab-research/VisionThink

[NeurIPS 2025] Efficient Reasoning Vision Language Models

☆459

Alternatives and similar repositories for VisionThink

Users interested in VisionThink are comparing it to the repositories listed below.

JIA-Lab-research / VisionZip
Official repository for VisionZip (CVPR 2025)
☆443 · Jul 21, 2025 · Updated last year
Mini-o3 / Mini-o3
Official Code for "Mini-o3: Scaling Up Reasoning Patterns and Interaction Turns for Visual Search"
☆422 · Jan 29, 2026 · Updated 5 months ago
NVlabs / Long-RL
Long-RL: Scaling RL to Long Sequences (NeurIPS 2025)
☆726 · Sep 24, 2025 · Updated 9 months ago
JIA-Lab-research / Lyra
[ICCV 2025] Official Implementation for "Lyra: An Efficient and Speech-Centric Framework for Omni-Cognition"
☆307 · Jan 9, 2025 · Updated last year
Yangsenqiao / ULDA
Unified Language-driven Zero-shot Domain Adaptation (CVPR 2024)
☆17 · Nov 28, 2024 · Updated last year
JIA-Lab-research / LSDBench
A benchmark that focuses on the sampling dilemma in long-video tasks. Through well-designed tasks, it evaluates the sampling efficiency o…
☆28 · Aug 7, 2025 · Updated 11 months ago
Visual-Agent / DeepEyes
☆1,249 · Nov 20, 2025 · Updated 8 months ago
yu-lin-li / ReBalance
[ICLR 2026] Efficient Reasoning with Balanced Thinking
☆131 · May 30, 2026 · Updated last month
yfzhang114 / Thyme
✨✨ [ICLR 2026] Think Beyond Images
☆582 · Sep 23, 2025 · Updated 9 months ago
JIA-Lab-research / VisionReasoner
[ICLR 2026] VisionReasoner: Unified Reasoning-Integrated Visual Perception via Reinforcement Learning
☆348 · Feb 9, 2026 · Updated 5 months ago
Zefan-Cai / R-KV
[NeurIPS 2025] R-KV: Redundancy-aware KV Cache Compression for Reasoning Models
☆1,203 · Updated this week
ByteDance-Seed / EvaLearn
EvaLearn is a pioneering benchmark designed to evaluate large language models (LLMs) on their learning capability and efficiency in chall…
☆431 · May 12, 2026 · Updated 2 months ago
zhaochen0110 / Awesome_Think_With_Images
Resources and paper list for "Thinking with Images for LVLMs". This repository accompanies our survey on how LVLMs can leverage visual in…
☆1,492 · Mar 9, 2026 · Updated 4 months ago
Danielement321 / HiPrune
[ACL-2026 Findings] Implementation for HiPrune, a training-free visual token pruning method for VLM acceleration.
☆58 · Apr 29, 2026 · Updated 2 months ago
ali-vilab / TTS-VAR
Test-time Scaling for VAR models
☆32 · Sep 19, 2025 · Updated 10 months ago
hiyouga / EasyR1
EasyR1: An Efficient, Scalable, Multi-Modality RL Training Framework based on veRL
☆5,071 · Updated this week
tulerfeng / Video-R1
Video-R1: Reinforcing Video Reasoning in MLLMs [🔥the first paper to explore R1 for video]
☆879 · Dec 14, 2025 · Updated 7 months ago
EvolvingLMMs-Lab / lmms-eval
One-for-All Multimodal Evaluation Toolkit Across Text, Image, Video, and Audio Tasks
☆4,320 · Updated this week
HKUDS / SepLLM
[ICML 2025] "SepLLM: Accelerate Large Language Models by Compressing One Segment into One Separator"
☆571 · Jul 29, 2025 · Updated 11 months ago
marinero4972 / Open-o3-Video
[ICML 2026] Official implementation of "Open-o3 Video: Grounded Video Reasoning with Explicit Spatio-Temporal Evidence"
☆157 · May 1, 2026 · Updated 2 months ago
ByteDance-Seed / Seed1.5-VL
Seed1.5-VL, a vision-language foundation model designed to advance general-purpose multimodal understanding and reasoning, achieving stat…
☆1,582 · Jun 14, 2025 · Updated last year
EvolvingLMMs-Lab / NEO
NEO Series: Native Vision-Language Models from First Principles
☆868 · Jul 1, 2026 · Updated 2 weeks ago
Theia-4869 / VisPruner
[ICCV 2025] Official code for paper: Beyond Text-Visual Attention: Exploiting Visual Cues for Effective Token Pruning in VLMs
☆84 · Jul 1, 2025 · Updated last year
suimuc / VIRES
☆342 · Jul 4, 2025 · Updated last year
kxfan2002 / SophiaVL-R1
SophiaVL-R1: Reinforcing MLLMs Reasoning with Thinking Reward
☆94 · Aug 8, 2025 · Updated 11 months ago
Theia-4869 / CDPruner
[NeurIPS 2025] Official code for paper: Beyond Attention or Similarity: Maximizing Conditional Diversity for Token Pruning in MLLMs.
☆105 · Sep 20, 2025 · Updated 10 months ago
Hunyuan-PromptEnhancer / PromptEnhancer
[CVPR 2026] PromptEnhancer is a prompt-rewriting tool, refining prompts into clearer, structured versions for better image generation.
☆3,729 · Jun 10, 2026 · Updated last month
EvolvingLMMs-Lab / LLaVA-OneVision-2
Fully Open Framework for Democratized Multimodal Training
☆1,143 · Updated this week
NVlabs / QeRL
[ICLR 2026] QeRL enables RL for 32B LLMs on a single H100 GPU.
☆511 · Mar 30, 2026 · Updated 3 months ago
pat-jj / s3
[EMNLP'25] s3 - ⚡ Efficient & Effective Search Agent Training via RL for RAG (RLVR for Search with Minimal Data)
☆842 · Nov 5, 2025 · Updated 8 months ago
daixiangzi / Awesome-Token-Compress
A paper list of recent works on token compression for ViT and VLM
☆939 · Updated this week
GenerTeam / GENERanno
GENERanno: A Genomic Foundation Model for Metagenomic Annotation
☆314 · Jun 15, 2026 · Updated last month
zli12321 / Vision-SR1
Reinforcement Learning of Vision Language Models with Self Visual Perception Reward
☆175 · Mar 14, 2026 · Updated 4 months ago
AdaptVision / AdaptVision
[CVPR 2026] AdaptVision: Efficient Vision-Language Models via Adaptive Visual Acquisition
☆40 · Apr 27, 2026 · Updated 2 months ago
yixinzhang98 / otc_med_chat_agent
An AI-powered conversational agent for recommending over-the-counter medications based on user symptoms and needs. Built with Python and …
☆198 · Jul 29, 2025 · Updated 11 months ago
MikeWangWZHL / PAPO
Official repo for "PAPO: Perception-Aware Policy Optimization for Multimodal Reasoning"
☆151 · Feb 4, 2026 · Updated 5 months ago
HJYao00 / Mulberry
[NIPS'25 Spotlight] Mulberry, an o1-like Reasoning and Reflection MLLM Implemented via Collective MCTS
☆1,244 · Jan 16, 2026 · Updated 6 months ago
hyperai / tvm-cn
TVM Documentation in Chinese Simplified / TVM 中文文档
☆3,854 · May 20, 2026 · Updated 2 months ago
jackdark425 / aigroupapp
AI Group is a powerful mobile intelligent assistant application that integrates multiple large language models (LLMs) and AI services, pr…
☆1,100 · Sep 10, 2025 · Updated 10 months ago