zlab-princeton/vero

Vero: An Open RL Recipe for General Visual Reasoning

☆134

Alternatives and similar repositories for vero

Users interested in vero are comparing it to the repositories listed below.

zlab-princeton / VisionFoundry
VisionFoundry: Teaching VLMs Visual Perception with Synthetic Images
☆51 · Apr 28, 2026 · Updated 2 months ago
zlab-princeton / UEval
UEval: A Benchmark for Unified Multimodal Generation
☆24 · Apr 20, 2026 · Updated 3 months ago
EvolvingLMMs-Lab / OpenMMReasoner
[CVPR 2026] OpenMMReasoner: Pushing the Frontiers for Multimodal Reasoning with an Open and General Recipe
☆164 · Mar 30, 2026 · Updated 3 months ago
EvolvingLMMs-Lab / LLaVA-OneVision-1.5-RL
Fully Open Framework for Democratized Multimodal Reinforcement Learning.
☆51 · Dec 19, 2025 · Updated 7 months ago
EvolvingLMMs-Lab / Evolving-Visual-Generation
[Roadmap] Visual Generation in the New Era: An Evolution from Atomic Mapping to Agentic World Modeling
☆124 · Jun 9, 2026 · Updated last month
EvolvingLMMs-Lab / OneVision-Encoder
Codec-Aligned Sparsity as a Foundational Principle for Multimodal Intelligence
☆385 · Jun 20, 2026 · Updated last month
yangzhou24 / RealGRPO
A Simple Way to Eliminate Reward Hacking in GRPO Diffusion Alignment
☆21 · Apr 14, 2026 · Updated 3 months ago
EvolvingLMMs-Lab / LLaVA-OneVision-2
Fully Open Framework for Democratized Multimodal Training
☆1,146 · Updated this week
penghao-wu / visual_jigsaw
☆78 · Apr 9, 2026 · Updated 3 months ago
NOVAglow646 / Monet
[CVPR 2026] Official codes of "Monet: Reasoning in Latent Visual Space Beyond Image and Language"
☆207 · Mar 19, 2026 · Updated 4 months ago
LiZizun / GeoAlign
Geo-Align: Video Generation Alignment via Metric Geometry Reward
☆33 · May 25, 2026 · Updated last month
CYWang735 / AdaTooler-V
☆71 · Feb 27, 2026 · Updated 4 months ago
tonghe90 / eidetic-memory
Own eidetic-memory like Sheldon
☆32 · Apr 8, 2026 · Updated 3 months ago
ModalMinds / gym-v
A unified framework for vision-language environments with Gymnasium-compatible interface
☆35 · Mar 17, 2026 · Updated 4 months ago
Yangr116 / VST
[ECCV2026] Visual Spatial Tuning
☆199 · Mar 25, 2026 · Updated 3 months ago
facebookresearch / multimodal_rewardbench
Multimodal RewardBench
☆68 · Feb 21, 2025 · Updated last year
tencent-ailab / Penguin-VL
Penguin-VL: Exploring the Efficiency Limits of VLM with LLM-based Vision Encoders [Technical Report]
☆204 · Mar 30, 2026 · Updated 3 months ago
IVUL-KAUST / VideoAuto-R1
[CVPR2026] VideoAuto-R1: Video Auto Reasoning via Thinking Once, Answering Twice
☆88 · Feb 27, 2026 · Updated 4 months ago
XIAO4579 / PRISM
Beyond SFT-to-RL: Pre-alignment via Black-Box On-Policy Distillation for Multimodal RL
☆96 · May 6, 2026 · Updated 2 months ago
black-yt / ReaLS
Exploring Representation-Aligned Latent Space for Better Generation
☆19 · Mar 17, 2026 · Updated 4 months ago
mm-vl / ULM-R1
Co-Reinforcement Learning for Unified Multimodal Understanding and Generation
☆48 · Jul 22, 2025 · Updated 11 months ago
EvolvingLMMs-Lab / ParaVT
ParaVT: Taming the Tool Prior Paradox for Parallel Tool Use in Agentic Video Reinforcement Learning
☆54 · Jun 2, 2026 · Updated last month
Cooperx521 / ScaleCap
(ICLR 2026) Official repository of 'ScaleCap: Inference-Time Scalable Image Captioning via Dual-Modality Debiasing'
☆60 · Jan 26, 2026 · Updated 5 months ago
Visual-Agent / DeepEyes
☆1,250 · Nov 20, 2025 · Updated 8 months ago
tonghe90 / auto-hf-papers
☆18 · Mar 25, 2026 · Updated 3 months ago
Li-Hao-yuan / GeoThinker
☆68 · Feb 12, 2026 · Updated 5 months ago
Tencent / HaploVLM
ICML2025
☆63 · Aug 28, 2025 · Updated 10 months ago
OpenSenseNova / SenseNova-Vision
Vision as Unified Multimodal Generation
☆448 · Updated this week
huangrh99 / AlphaGRPO
[ICML2026] Official Implementation of AlphaGRPO: Unlocking Self-Reflective Multimodal Generation in Unified Multimodal Models via Decompo…
☆73 · Jul 14, 2026 · Updated last week
THUMAI-Lab / LLaVA-UHD-v4
☆46 · Jun 7, 2026 · Updated last month
shawn0728 / OpenSearch-VL
🔍 OpenSearch-VL provides a fully open recipe for training strong multimodal deep search agents through high-quality data curation, diver…
☆254 · May 19, 2026 · Updated 2 months ago
zlab-princeton / llm-distillation-jax
JAX implementation of configurable LLM distillation training
☆24 · Nov 15, 2025 · Updated 8 months ago
tang-bd / v-grpo
[CVPR 2026 Findings] V-GRPO: Online Reinforcement Learning for Denoising Generative Models Is Easier than You Think
☆56 · Apr 28, 2026 · Updated 2 months ago
ls-kelvin / REVPT
Code for paper: Reinforced Vision Perception with Tools
☆74 · Oct 3, 2025 · Updated 9 months ago
Tencent-Hunyuan / UniRL
UniRL is a Framework for Unified Multimodal Model Reinforcement Learning
☆835 · Updated this week
NVlabs / Long-RL
Long-RL: Scaling RL to Long Sequences (NeurIPS 2025)
☆726 · Sep 24, 2025 · Updated 9 months ago
EvolvingLMMs-Lab / open-r1-multimodal
A fork to add multimodal model training to open-r1
☆1,591 · Feb 8, 2025 · Updated last year
tulerfeng / Video-R1
Video-R1: Reinforcing Video Reasoning in MLLMs [🔥the first paper to explore R1 for video]
☆879 · Dec 14, 2025 · Updated 7 months ago
EvolvingLMMs-Lab / NEO
NEO Series: Native Vision-Language Models from First Principles
☆870 · Jul 1, 2026 · Updated 2 weeks ago