[CVPR 2026] ViStoryBench: AI Story Visualization Benchmark
☆137Mar 4, 2026Updated this week
Alternatives and similar repositories for vistorybench
Users who are interested in vistorybench are comparing it to the libraries listed below
- Omni Controllable Video Diffusion☆41Dec 22, 2025Updated 2 months ago
- iMontage: Unified, Versatile, Highly Dynamic Many-to-many Image Generation☆185Dec 1, 2025Updated 3 months ago
- [NeurIPS 2025 DB] OneIG-Bench is a meticulously designed comprehensive benchmark framework for fine-grained evaluation of T2I models acro…☆108Feb 10, 2026Updated 3 weeks ago
- Official InfiniBench: A Benchmark for Large Multi-Modal Models in Long-Form Movies and TV Shows☆19Nov 4, 2025Updated 4 months ago
- ☆453Aug 10, 2025Updated 6 months ago
- This is the official code repository for the paper: Towards General Continuous Memory for Vision-Language Models.☆21Jul 3, 2025Updated 8 months ago
- ComfyUI version of WithAnyone☆23Dec 18, 2025Updated 2 months ago
- DiffThinker: Towards Generative Multimodal Reasoning with Diffusion Models☆170Jan 4, 2026Updated 2 months ago
- YOLO-TLP: detected and classified tiny objects with bounding box dimensions smaller than 15 pixels, outperforming other one-stage detecto…☆21Oct 6, 2025Updated 5 months ago
- More reliable Video Understanding Evaluation☆14Sep 23, 2025Updated 5 months ago
- Controllable Animation Video Generation with Large Models-based Multimodal Agents☆233Jan 7, 2026Updated last month
- [CVPR2025] VDocRAG: Retrieval-Augmented Generation over Visually-Rich Documents☆59May 26, 2025Updated 9 months ago
- [EMNLP 2024] Preserving Multi-Modal Capabilities of Pre-trained VLMs for Improving Vision-Linguistic Compositionality☆21Oct 8, 2024Updated last year
- Official PyTorch implementation of RACRO (https://www.arxiv.org/abs/2506.04559)☆19Jul 1, 2025Updated 8 months ago
- A Comprehensive Dataset for Advanced Image Generation and Editing☆31Oct 2, 2025Updated 5 months ago
- Code for FreeTraj, a tuning-free method for trajectory-controllable video generation☆111Sep 19, 2025Updated 5 months ago
- Official repo for StyleMe3D☆28Apr 22, 2025Updated 10 months ago
- Visual Spatial Tuning☆176Feb 19, 2026Updated 2 weeks ago
- This project is the official implementation of 'DreamOmni3: Scribble-based Editing and Generation'☆38Dec 30, 2025Updated 2 months ago
- this is for fun, ain't it grand!☆21Sep 18, 2025Updated 5 months ago
- Real-time streaming digital human based on NeRF☆19May 20, 2024Updated last year
- ☆33Jul 15, 2025Updated 7 months ago
- Exploring Feature Self-relation for Self-supervised Transformer (TPAMI 2023)☆21Apr 30, 2025Updated 10 months ago
- This project is the official implementation of "UnityVideo: Unified Multi-Modal Multi-Task Learning for Enhancing World-Aware Video Gener…☆208Jan 29, 2026Updated last month
- [CVPR 2026] 🔥🔥 Official Repo of UMO: Scaling Multi-Identity Consistency for Image Customization via Matching Reward☆179Sep 15, 2025Updated 5 months ago
- ☆21Aug 30, 2025Updated 6 months ago
- Official implementation of the paper "Bind-Your-Avatar: Multi-Talking-Character Video Generation with Dynamic 3D-mask-based Embedding Rou…☆34Sep 25, 2025Updated 5 months ago
- Large language models designed for formal theorem proving through tool-integrated reasoning.☆33Aug 13, 2025Updated 6 months ago
- ☆55Jan 30, 2026Updated last month
- Official Code for 'AR-1-to-3: Single Image to Consistent 3D Object Generation via Next-View Prediction' (ICCV 2025)☆62Nov 8, 2025Updated 3 months ago
- An interactive thinking and deep reasoning model. It provides a cognitive reasoning paradigm for complex multi-hop problems.☆79Nov 14, 2025Updated 3 months ago
- ☆53Dec 10, 2025Updated 2 months ago
- [ICCV 2025, Highlight] Official Pytorch implementation of the paper: "ReFlex: Text-Guided Editing of Real Images in Rectified Flow via Mi…☆36Aug 1, 2025Updated 7 months ago
- ☆34Oct 9, 2025Updated 4 months ago
- [NeurIPS 2024] COVE: Unleashing the Diffusion Feature Correspondence for Consistent Video Editing☆25Dec 8, 2024Updated last year
- Evaluating Deep Multimodal Reasoning in Vision-Centric Agentic Tasks☆37Nov 27, 2025Updated 3 months ago
- Code release for "CURIE: Evaluating LLMs On Multitask Scientific Long Context Understanding and Reasoning", ICLR 2025☆29Apr 21, 2025Updated 10 months ago
- ☆27Jun 4, 2024Updated last year
- [NeurIPS 2024] TransAgent: Transfer Vision-Language Foundation Models with Heterogeneous Agent Collaboration☆26Oct 17, 2024Updated last year