anpwu / ZJU-CS-ClassNotes
☆21 Updated 3 years ago
Alternatives and similar repositories for ZJU-CS-ClassNotes
Users who are interested in ZJU-CS-ClassNotes are comparing it to the repositories listed below.
- A paper list for spatial reasoning ☆143 Updated 4 months ago
- A collection of vision foundation models unifying understanding and generation. ☆56 Updated 9 months ago
- [ICML2025] The code and data of Paper: Towards World Simulator: Crafting Physical Commonsense-Based Benchmark for Video Generation ☆127 Updated 11 months ago
- Thinking with Videos from Open-Source Priors. We reproduce chain-of-frames visual reasoning by fine-tuning open-source video models. Give… ☆131 Updated last week
- ☆28 Updated 7 months ago
- The official implementation of work "REPARO: Compositional 3D Assets Generation with Differentiable 3D Layout Alignment". ☆118 Updated last year
- 📖 This is a repository for organizing papers, codes, and other resources related to unified multimodal models. ☆313 Updated last week
- A comprehensive list of papers investigating physical cognition in video generation, including papers, codes, and related websites. ☆183 Updated last week
- The code for Fine-grained HBOE | AAAI 2024 (official version and optimized version). ☆16 Updated last year
- BranchGRPO: Stable and Efficient GRPO with Structured Branching in Diffusion Models ☆34 Updated last month
- [ICML 2025] DreamDPO: Aligning Text-to-3D Generation with Human Preferences via Direct Preference Optimization ☆17 Updated 4 months ago
- MetaSpatial leverages reinforcement learning to enhance 3D spatial reasoning in vision-language models (VLMs), enabling more structured, … ☆190 Updated 5 months ago
- Collection of the latest spatial, 3D, and video/temporal reasoning papers ☆22 Updated 2 weeks ago
- A framework for unified personalized model, achieving mutual enhancement between personalized understanding and generation. Demonstrating… ☆121 Updated 2 weeks ago
- A curated list of awesome autoregressive papers in Generative AI ☆119 Updated 3 weeks ago
- Machine Mental Imagery: Empower Multimodal Reasoning with Latent Visual Tokens (arXiv 2025) ☆174 Updated 2 months ago
- A simple and flexible PyTorch implementation of StableDiffusion-3 based on diffusers for DIY and finetuning. ☆24 Updated 4 months ago
- [ICLR2025] The code of Z-Sampling, proposed in our paper "Zigzag Diffusion Sampling: Diffusion Models Can Self-Improve via Self-Reflectio… ☆92 Updated 8 months ago
- A list of works on video generation towards world model ☆167 Updated 2 months ago
- Chat about anything on any video! ☆36 Updated 2 years ago
- Official repo of paper "Reconstruction Alignment Improves Unified Multimodal Models". Unlocking the Massive Zero-shot Potential in Unifie… ☆284 Updated this week
- [ACMMM 2025 - Dataset Track] ComplexBench-Edit: Benchmarking Complex Instruction-Driven Image Editing via Compositional Dependencies ☆19 Updated 3 months ago
- A preview-version of one novel multimodal reasoning benchmark CharmBench. ☆23 Updated 2 months ago
- [NeurIPS 2025] VideoREPA: Learning Physics for Video Generation through Relational Alignment with Foundation Models ☆80 Updated this week
- ☆50 Updated last month
- ☆21 Updated last year
- ☆52 Updated last month
- An easy way for debug python for Slurm HPC users. ☆26 Updated 6 months ago
- [NeurIPS 2024] DEMO: Enhancing Motion in Text-to-Video Generation with Decomposed Encoding and Conditioning ☆47 Updated 11 months ago
- WISE: A World Knowledge-Informed Semantic Evaluation for Text-to-Image Generation ☆156 Updated 3 weeks ago