HaroldChen19/VistaDPO

[ICML 2025] VistaDPO: Video Hierarchical Spatial-Temporal Direct Preference Optimization for Large Video Models

☆42

Alternatives and similar repositories for VistaDPO

Users who are interested in VistaDPO are comparing it to the repositories listed below.

EnVision-Research / ScalingAR
[ICML 2026] ScalingAR: Scaling Confidence for Autoregressive Image Generation
☆22 · May 5, 2026 · Updated 2 months ago
GeekGuru123 / ProfilingDiT
☆20 · Jan 1, 2026 · Updated 6 months ago
KyleHuang9 / SeFAR
[AAAI 2025] SeFAR: Semi-supervised Fine-grained Action Recognition with Temporal Perturbation and Learning Stabilization
☆30 · Jan 3, 2025 · Updated last year
LAW1223 / AlignVid
☆24 · May 29, 2026 · Updated 2 months ago
DuNGEOnmassster / VideoGen-of-Thought
[NeurIPS 2025 NextVid Workshop Oral✨] Official Implementation of VideoGen-of-Thought: Step-by-step generating multi-shot video with minim…
☆63 · Sep 22, 2025 · Updated 10 months ago
XianfengWu01 / LightGen
An Efficient Text-to-Image Generation Pretrain Pipeline
☆132 · Apr 18, 2025 · Updated last year
shuzhangzhong / HybriMoE-Preview
☆17 · Apr 9, 2025 · Updated last year
Ziyang412 / Video-RTS
Code for EMNLP25 paper "Video-RTS: Rethinking Reinforcement Learning and Test-Time Scaling for Efficient and Enhanced Video Reasoning"
☆24 · Feb 18, 2026 · Updated 5 months ago
EnVision-Research / TiViBench
[CVPR 2026] TiViBench: Benchmarking Think-in-Video Reasoning for Video Generative Models
☆67 · Feb 21, 2026 · Updated 5 months ago
DuNGEOnmassster / awesome-customized-generative-AI
Papers and codes collection for customized, personalized and editable generative models
☆28 · Oct 1, 2024 · Updated last year
G-U-N / Diffusion-NPO
[ICLR 2025, AAAI 2026] official implementation of "Diffusion-NPO: Negative Preference Optimization for Better Preference Aligned Generati…
☆39 · Jan 26, 2026 · Updated 6 months ago
EnVision-Research / LatentMorph
[ICML 2026] LatentMorph: Morphing Latent Reasoning into Image Generation
☆47 · May 5, 2026 · Updated 2 months ago
InternRobotics / EgoThinker
Official implementation of EgoThinker at NeurIPS 2025
☆29 · Nov 25, 2025 · Updated 8 months ago
PKU-YuanGroup / Next-Patch-Prediction
[AAAI26] Next Patch Prediction
☆129 · Jan 2, 2025 · Updated last year
hustvl / 4DLangVGGT
Official implementation of “4D LangVGGT: 4D Language-Visual Geometry Grounded Transformer”
☆91 · Mar 25, 2026 · Updated 4 months ago
EnVision-Research / A4-Agent
A4-Agent: An Agentic Framework for Zero-Shot Affordance Reasoning (ECCV 2026)
☆41 · Jun 29, 2026 · Updated last month
laulampaul / text-animator
☆20 · Jun 26, 2024 · Updated 2 years ago
snap-research / VIMI
☆13 · Jul 10, 2024 · Updated 2 years ago
tsunghan-wu / reverse_vlm
🔥 [NeurIPS 2025] Official implementation of "Generate, but Verify: Reducing Visual Hallucination in Vision-Language Models with Retrospe…
☆58 · Jan 22, 2026 · Updated 6 months ago
UKPLab / arxiv2025-inherent-limits-plms
Code repository for the paper "The Inherent Limits of Pretrained LLMs: The Unexpected Convergence of Instruction Tuning and In-Context Le…
☆14 · Jan 16, 2025 · Updated last year
EnVision-Research / PAP
Panoramic Affordance Prediction (PAP) (ECCV 2026)
☆46 · Jun 29, 2026 · Updated last month
JethroJames / TUNED
[AAAI 2025] Trusted Unified Feature-Neighborhood Dynamics for Multi-View Classification
☆20 · Apr 17, 2025 · Updated last year
jylins / hourllava
[NeurIPS 2025 Spotlight] Unleashing Hour-Scale Video Training for Long Video-Language Understanding
☆19 · Jun 24, 2025 · Updated last year
EnVision-Research / PhysToolBench
PhysToolBench: Benchmarking Physical Tool Understanding for MLLMs
☆30 · Jul 20, 2026 · Updated last week
CIntellifusion / VideoDPO
Official Implementation of VideoDPO
☆169 · Jun 1, 2025 · Updated last year
EsmaeilNarimissa / aws-sft-grpo-budget-llm-finetune
☆19 · May 17, 2025 · Updated last year
RoboVIP / RoboVIP_VDM
RoboVIP: Multi-View Video Generation with Visual Identity Prompting Augments Robot Manipulation
☆30 · Apr 3, 2026 · Updated 3 months ago
JaaackHongggg / WorldSense
WorldSense: Evaluating Real-world Omnimodal Understanding for Multimodal LLMs
☆50 · Jul 12, 2026 · Updated 2 weeks ago
MCG-NJU / RGE
Reasoning Guided Embeddings: Leveraging MLLM Reasoning for Improved Multimodal Retrieval
☆15 · Nov 29, 2025 · Updated 8 months ago
jialuli-luka / Video-MSG
Training-free Guidance in Text-to-Video Generation via Multimodal Planning and Structured Noise Initialization
☆28 · Apr 14, 2025 · Updated last year
LanDiff / LanDiff
The Best of Both Worlds: Integrating Language Models and Diffusion Models for Video Generation
☆41 · May 4, 2025 · Updated last year
mbzuai-oryx / Video-R2
Video-R2: Reinforcing Consistent and Grounded Reasoning in Multimodal Language Models
☆19 · Jan 21, 2026 · Updated 6 months ago
shenao-zhang / reward-augmented-preference
The official implementation of Preference Data Reward-Augmentation.
☆18 · May 1, 2025 · Updated last year
longmalongma / TW-GRPO
The official repository of our paper "Reinforcing Video Reasoning with Focused Thinking"
☆36 · Jun 12, 2025 · Updated last year
HL-hanlin / Bifrost-1
Official implementation of Bifrost-1: Bridging Multimodal LLMs and Diffusion Models with Patch-level CLIP Latents (NeurIPS 2025)
☆47 · Nov 24, 2025 · Updated 8 months ago
AhmedZgaren / Save
☆33 · Oct 2, 2025 · Updated 9 months ago
TencentARC / SEED-Bench-R1
☆100 · Jun 23, 2025 · Updated last year
AV-Reasoner / AV-Reasoner
☆19 · Jul 22, 2025 · Updated last year
SihengLi99 / RePO
RePO: Replay-Enhanced Policy Optimization
☆24 · Jun 12, 2025 · Updated last year