CYWang735/AdaTooler-V

Readme badge preview -

If you own this repo, copy the snippet below and add it to your README.md

[![RelatedRepos](https://img.shields.io/badge/related-repos-yellow)](https://relatedrepos.com/gh/CYWang735/AdaTooler-V)

CYWang735 / AdaTooler-V

☆72

Alternatives and similar repositories for AdaTooler-V

Users that are interested in AdaTooler-V are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.

Sorting:

AntResearchNLP / ViLaSR
View on GitHub
[NeurIPS 2025] Reinforcing Spatial Reasoning in Vision-Language Models with Interwoven Thinking and Visual Drawing
☆98Jul 27, 2025Updated last year
GAIR-NLP / Med
View on GitHub
[ICML 2026] What Does Vision Tool-Use Reinforcement Learning Really Learn? Disentangling Tool-Induced and Intrinsic Effects for Crop-and-…
☆23May 15, 2026Updated 2 months ago
EvolvingLMMs-Lab / ParaVT
View on GitHub
ParaVT: Taming the Tool Prior Paradox for Parallel Tool Use in Agentic Video Reinforcement Learning
☆54Jun 2, 2026Updated last month
zsgvivo / VideoZoomer
View on GitHub
☆34Feb 12, 2026Updated 5 months ago
Koreyoshi01 / VISD
View on GitHub
This repository is the official implementation for VISD.
☆22May 17, 2026Updated 2 months ago
Virtual machines for every use case on DigitalOcean • Ad
Get dependable uptime with 99.99% SLA, simple security tools, and predictable monthly pricing with DigitalOcean's virtual machines, called Droplets.
agents-x-project / PyVision-RL
View on GitHub
[ICML 2026] Official implementation of "PyVision-RL: Forging Open Agentic Vision Models via RL."
☆70Feb 25, 2026Updated 5 months ago
MCG-NJU / Video-o3
View on GitHub
[ICML 2026] Video-o3: Native Interleaved Clue Seeking for Long Video Multi-Hop Reasoning
☆135Jul 2, 2026Updated 3 weeks ago
ByteDance-BandAI / CodeVision
View on GitHub
[CVPR 2026] Thinking with Programming Vision: Towards a Unified View for Thinking with Images
☆71Jan 23, 2026Updated 6 months ago
fansunqi / VideoTool
View on GitHub
Official Repository for NeurIPS'25 Paper "Tool-Augmented Spatiotemporal Reasoning for Streamlining Video Question Answering Task"
☆23May 18, 2026Updated 2 months ago
marinero4972 / Open-o3-Video
View on GitHub
[ICML 2026] Official implementation of "Open-o3 Video: Grounded Video Reasoning with Explicit Spatio-Temporal Evidence"
☆157May 1, 2026Updated 2 months ago
wanglu-cs / Think_While_Watching
View on GitHub
☆19Jun 26, 2026Updated last month
lcqysl / FrameThinker
View on GitHub
[ICLR 2026] Official repo for "FrameThinker: Learning to Think with Long Videos via Multi-Turn Frame Spotlighting"
☆50Oct 9, 2025Updated 9 months ago
UCSB-AI / DMLR
View on GitHub
[CVPR2026] Official codebase for the paper "Reasoning Within the Mind: Dynamic Multimodal Interleaving in Latent Space"
☆85May 12, 2026Updated 2 months ago
Ziyang412 / Video-RTS
View on GitHub
Code for EMNLP25 paper "Video-RTS: Rethinking Reinforcement Learning and Test-Time Scaling for Efficient and Enhanced Video Reasoning"
☆24Feb 18, 2026Updated 5 months ago
Managed hosting for WordPress and PHP on Cloudways • Ad
Managed hosting for WordPress, Magento, Laravel, or PHP apps, on multiple cloud providers. Deploy in minutes on Cloudways by DigitalOcean.
zhaochen0110 / Awesome_Think_With_Images
View on GitHub
Resources and paper list for "Thinking with Images for LVLMs". This repository accompanies our survey on how LVLMs can leverage visual in…
☆1,497Mar 9, 2026Updated 4 months ago
QiWang98 / VideoRFT
View on GitHub
[NeurIPS 2025] VideoRFT: Incentivizing Video Reasoning Capability in MLLMs via Reinforced Fine-Tuning
☆64Jan 6, 2026Updated 6 months ago
tulerfeng / OneThinker
View on GitHub
🔥 OneThinker: All-in-one Reasoning Model for Image and Video [CVPR 2026]
☆463Feb 28, 2026Updated 5 months ago
LaVi-Lab / Rethink_CoT_Video
View on GitHub
Official code for "Rethinking Chain-of-Thought Reasoning for Videos"
☆21Dec 14, 2025Updated 7 months ago
yfzhang114 / Thyme
View on GitHub
✨✨ [ICLR 2026] Think Beyond Images
☆584Sep 23, 2025Updated 10 months ago
zlab-princeton / vero
View on GitHub
Vero: An Open RL Recipe for General Visual Reasoning
☆137Jun 19, 2026Updated last month
zhang9302002 / ThinkingWithVideos
View on GitHub
The official code of "Thinking With Videos: Multimodal Tool-Augmented Reinforcement Learning for Long Video Reasoning"
☆102Oct 15, 2025Updated 9 months ago
hmxiong / StreamChat
View on GitHub
Official repo for "Streaming Video Understanding and Multi-round Interaction with Memory-enhanced Knowledge" ICLR2025
☆111Mar 14, 2025Updated last year
xuyang-liu16 / hermes-code-bridge
View on GitHub
Use Hermes Agent as the control plane for local coding agents like Codex, Kimi Code, Claude Code, OpenCode, and Gemini CLI.
☆25Jul 22, 2026Updated last week
1-Click AI Models by DigitalOcean Gradient • Ad
Deploy popular AI models on DigitalOcean Gradient GPU virtual machines with just a single click. Zero configuration with optimized deployments.
hwanyu112 / Latent-Sketchpad
View on GitHub
☆73Feb 1, 2026Updated 5 months ago
Mini-o3 / Mini-o3
View on GitHub
Official Code for "Mini-o3: Scaling Up Reasoning Patterns and Interaction Turns for Visual Search"
☆423Jan 29, 2026Updated 6 months ago
VisionOPD / Vision-OPD
View on GitHub
Vision-OPD is a regional-to-global on-policy self-distillation framework that transfers a model's own privileged crop-conditioned percept…
☆214Jul 17, 2026Updated last week
Accio-Lab / SwimBird
View on GitHub
☆18Apr 9, 2026Updated 3 months ago
EvolvingLMMs-Lab / SimpleStream
View on GitHub
A simple video streaming baseline that outperforms SOTAs.
☆153May 1, 2026Updated 2 months ago
uclanlp / OpenVLThinker
View on GitHub
OpenVLThinker: An Early Exploration to Vision-Language Reasoning via Iterative Self-Improvement
☆155May 25, 2026Updated 2 months ago
ZiyuGuo99 / ATLAS
View on GitHub
One Discrete Word for Visual Reasoning Overtakes Agentic and Latent Methods
☆137Jun 9, 2026Updated last month
LJungang / Awesome-Video-Reasoning-Landscape
View on GitHub
🔥An open-source survey of the latest video reasoning tasks, paradigms, and benchmarks.
☆190Jun 14, 2026Updated last month
tulerfeng / Video-R1
View on GitHub
Video-R1: Reinforcing Video Reasoning in MLLMs [🔥the first paper to explore R1 for video]
☆884Dec 14, 2025Updated 7 months ago
1-Click AI Models by DigitalOcean Gradient • Ad
Deploy popular AI models on DigitalOcean Gradient GPU virtual machines with just a single click. Zero configuration with optimized deployments.
shawn0728 / OpenSearch-VL
View on GitHub
🔍 OpenSearch-VL provides a fully open recipe for training strong multimodal deep search agents through high-quality data curation, diver…
☆258May 19, 2026Updated 2 months ago
Becomebright / ReKV
View on GitHub
[ICLR'25] Streaming Video Question-Answering with In-context Video KV-Cache Retrieval
☆122Nov 4, 2025Updated 8 months ago
waltonfuture / RL-with-Cold-Start
View on GitHub
SFT+RL boosts multimodal reasoning
☆47Jun 27, 2025Updated last year
IVUL-KAUST / VideoAuto-R1
View on GitHub
[CVPR2026] VideoAuto-R1: Video Auto Reasoning via Thinking Once, Answering Twice
☆88Feb 27, 2026Updated 5 months ago
OpenGVLab / VKnowU
View on GitHub
[ECCV 2026] VKnowU: Evaluating Visual Knowledge Understanding in Multimodal LLMs
☆16Feb 3, 2026Updated 5 months ago
Visual-Agent / DeepEyesV2
View on GitHub
☆625Feb 26, 2026Updated 5 months ago
KD-TAO / OmniAgent
View on GitHub
OmniAgent: Audio-Guided Active Perception Agent for Omnimodal Audio-Video Understanding
☆22Apr 9, 2026Updated 3 months ago