jialuli-luka/Video-MSG

Training-free Guidance in Text-to-Video Generation via Multimodal Planning and Structured Noise Initialization

☆28

Alternatives and similar repositories for Video-MSG

Users who are interested in Video-MSG are comparing it to the repositories listed below.

Yui010206 / MEXA
[EMNLP 2025 Findings] MEXA: Towards General Multimodal Reasoning with Dynamic Multi-Expert Aggregation
☆15 · Aug 22, 2025 · Updated 11 months ago
wz0919 / EPiC
[ICML2026] Official implementation of EPiC: Efficient Video Camera Control Learning with Precise Anchor-Video Guidance
☆50 · Jun 2, 2025 · Updated last year
Yui010206 / Adaptive-Visual-Imagination-Control
When and How Much to Imagine: Adaptive Test-Time Scaling with World Models for Visual Spatial Reasoning
☆18 · Jun 2, 2026 · Updated last month
Yui010206 / VEGGIE-VidEdit
[ICCV2025] VEGGIE: Instructional Editing and Reasoning Video Concepts with Grounded Generation
☆34 · Aug 18, 2025 · Updated 11 months ago
Ziyang412 / Video-RTS
Code for EMNLP25 paper "Video-RTS: Rethinking Reinforcement Learning and Test-Time Scaling for Efficient and Enhanced Video Reasoning"
☆24 · Feb 18, 2026 · Updated 5 months ago
CeeZh / SILVR
Official Implementation for "SiLVR: A Simple Language-based Video Reasoning Framework"
☆19 · Jan 18, 2026 · Updated 6 months ago
Yui010206 / Ego2Web
[CVPR 2026] Ego2Web: A Web Agent Benchmark Grounded in Egocentric Videos
☆29 · Mar 25, 2026 · Updated 4 months ago
black-yt / ReaLS
Exploring Representation-Aligned Latent Space for Better Generation
☆19 · Mar 17, 2026 · Updated 4 months ago
CrystalSixone / VLN-MAGIC
This is the official repository for MAGIC: Meta-Ability Guided Interactive Chain-of-Distillation Learning towards Efficient Vision-and-La…
☆17 · May 17, 2026 · Updated 2 months ago
daeunni / StreamGaze
Code for "StreamGaze: Gaze-Guided Temporal Reasoning and Proactive Understanding in Streaming Videos"
☆27 · May 13, 2026 · Updated 2 months ago
FreedomIntelligence / MedGen
MedGen: Unlocking Medical Video Generation by Scaling Granularly-annotated Medical Videos.
☆33 · Apr 18, 2026 · Updated 3 months ago
iSEE-Laboratory / VLN-PRET
☆23 · Oct 19, 2024 · Updated last year
YicongHong / Ego2Map-NaViT
Official Implementation of Learning Navigational Visual Representations with Semantic Map Supervision (ICCV2023)
☆28 · Jul 30, 2023 · Updated 2 years ago
HAWLYQ / ET-Cap
☆24 · Oct 8, 2023 · Updated 2 years ago
deepshwang / crepa
☆15 · Jun 21, 2025 · Updated last year
Madaoer / VLIPP
[ICCV 2025] PyTorch implementation of "VLIPP: Towards Physically Plausible Video Generation with Vision and Language Informed Physical Pr…
☆55 · Jul 28, 2025 · Updated 11 months ago
GeekGuru123 / ProfilingDiT
☆20 · Jan 1, 2026 · Updated 6 months ago
daeunni / Video-Skill-CoT
Code for "Skill-based Chain-of-Thoughts for Domain-Adaptive Video Reasoning [EMNLP 2025 Findings]"
☆18 · Aug 27, 2025 · Updated 10 months ago
wz0919 / VLN-SRDF
Official implementation of: Bootstrapping Language-Guided Navigation Learning with Self-Refining Data Flywheel
☆35 · Jun 10, 2025 · Updated last year
zhaoc5 / Grounding-REVERIE-Challenge
Official REVERIE Grounding Model of REVERIE Challenge @ CSIG 2022
☆19 · Oct 17, 2022 · Updated 3 years ago
jialuli-luka / SELMA
Code and Data for Paper: SELMA: Learning and Merging Skill-Specific Text-to-Image Experts with Auto-Generated Data
☆35 · Mar 12, 2024 · Updated 2 years ago
daeunni / VideoRepair
Code for "VideoRepair: Improving Text-to-Video Generation via Misalignment Evaluation and Localized Refinement [ACL 2026 Findings]"
☆52 · Apr 7, 2026 · Updated 3 months ago
LanDiff / LanDiff
The Best of Both Worlds: Integrating Language Models and Diffusion Models for Video Generation
☆41 · May 4, 2025 · Updated last year
jaehong31 / SAFREE
[ICLR 2025] SAFREE: Training-Free and Adaptive Guard for Safe Text-to-Image and Video Generation
☆59 · Jan 22, 2025 · Updated last year
Eyeline-Labs / VChain
[ACL 2026 Findings, ICCV 2025 Workshop Outstanding Paper Award] VChain: Chain-of-Visual-Thought for Reasoning in Video Generation
☆120 · Apr 8, 2026 · Updated 3 months ago
johndpope / DiPIR-hack
Photorealistic Object Insertion with Diffusion-Guided Inverse Rendering (nvidia)
☆16 · Sep 24, 2024 · Updated last year
Vchitect / RealDPO
☆32 · Dec 17, 2025 · Updated 7 months ago
SAIS-FUXI / IPO
☆58 · May 6, 2025 · Updated last year
Phantom-video / LibraGen
☆17 · Mar 19, 2026 · Updated 4 months ago
xizaoqu / MOFT
[NeurIPS 2024] Video Diffusion Models are Training-free Motion Interpreter and Controller
☆51 · Aug 5, 2025 · Updated 11 months ago
Lliar-liar / Daily-Omni
This is the official repository of Daily-Omni: Towards Audio-Visual Reasoning with Temporal Alignment across Modalities
☆42 · Apr 28, 2026 · Updated 2 months ago
QUVA-Lab / PIN
Official code repo of PIN: Positional Insert Unlocks Object Localisation Abilities in VLMs
☆26 · Jan 14, 2025 · Updated last year
jacobkrantz / Sim2Sim-VLNCE
Official implementation of the ECCV 2022 Oral paper: Sim-2-Sim Transfer for Vision-and-Language Navigation in Continuous Environments
☆35 · Dec 16, 2023 · Updated 2 years ago
Picsart-AI-Research / Social-Reward
[ICLR 2024 Spotlight] Social Reward: Evaluating and Enhancing Generative AI through Million-User Feedback from an Online Creative Communi…
☆12 · Mar 29, 2024 · Updated 2 years ago
ByteDance-Seed / VINCIE
Official code for VINCIE: Unlocking In-context Image Editing from Video
☆60 · Jun 19, 2026 · Updated last month
ali-vilab / iv-vae
☆34 · Mar 4, 2025 · Updated last year
Westlake-AGI-Lab / SwitchCraft
Official Implementation of SwitchCraft: Training-Free Multi-Event Video Generation with Attention Controls [CVPR 2026]
☆24 · Mar 2, 2026 · Updated 4 months ago
wz0919 / DreamRunner
[AAAI 2026] Official implementation of DreamRunner: Fine-Grained Storytelling Video Generation with Retrieval-Augmented Motion Adaptation
☆78 · Jun 11, 2025 · Updated last year
TonyLianLong / LLM-groundedVideoDiffusion
[ICLR 2024] LLM-grounded Video Diffusion Models (LVD): official implementation for the LVD paper
☆172 · May 7, 2024 · Updated 2 years ago