anpwu / ZJU-CS-ClassNotes
☆21Updated 3 years ago
Alternatives and similar repositories for ZJU-CS-ClassNotes
Users that are interested in ZJU-CS-ClassNotes are comparing it to the libraries listed below
- 📖 This is a repository for organizing papers, codes, and other resources related to unified multimodal models.☆348Updated last month
- 【COLING 2025🔥】Code for the paper "Is Parameter Collision Hindering Continual Learning in LLMs?".☆38Updated last year
- [ICML2025] The code and data of Paper: Towards World Simulator: Crafting Physical Commonsense-Based Benchmark for Video Generation☆149Updated last year
- A collection of vision foundation models unifying understanding and generation.☆59Updated last year
- [ACMMM 2025 - Dataset Track] ComplexBench-Edit: Benchmarking Complex Instruction-Driven Image Editing via Compositional Dependencies☆22Updated 7 months ago
- [ICML 2025] DreamDPO: Aligning Text-to-3D Generation with Human Preferences via Direct Preference Optimization☆20Updated 8 months ago
- The code for Fine-grained HBOE | AAAI 2024 (official version and optimized version).☆16Updated last year
- This is a collection of recent papers on reasoning in video generation models.☆95Updated last month
- [CVPR 2025] 🔥 Official impl. of "TokenFlow: Unified Image Tokenizer for Multimodal Understanding and Generation".☆433Updated 6 months ago
- A comprehensive list of papers investigating physical cognition in video generation, including papers, codes, and related websites.☆259Updated this week
- A framework for unified personalized model, achieving mutual enhancement between personalized understanding and generation. Demonstrating…☆128Updated last month
- Video Generation Benchmark☆68Updated 8 months ago
- Physical laws underpin all existence, and harnessing them for generative modeling opens boundless possibilities for advancing science and…☆267Updated last month
- Chat about anything on any video!☆38Updated 2 years ago
- WISE: A World Knowledge-Informed Semantic Evaluation for Text-to-Image Generation☆183Updated 3 months ago
- Watch for idle GPUs and run your jobs: launches jobs in tmux, keeps logs/status and sends start/finish emails.☆81Updated 4 months ago
- MLLM-Tool: A Multimodal Large Language Model For Tool Agent Learning☆138Updated 3 months ago
- ☆38Updated 7 months ago
- [ICLR 2026] Uni-CoT: Towards Unified Chain-of-Thought Reasoning Across Text and Vision☆207Updated last week
- A simple and flexible PyTorch implementation of StableDiffusion-3 based on diffusers for DIY and finetuning.☆25Updated 8 months ago
- [ICLR 2026] Official repo of paper "Reconstruction Alignment Improves Unified Multimodal Models". Unlocking the Massive Zero-shot Potenti…☆361Updated this week
- Video Generation, Physical Commonsense, Semantic Adherence, VideoCon-Physics☆179Updated last week
- Official implementation of "Paper2Rebuttal: A Multi-Agent Framework for Transparent Author Response Assistance"☆374Updated 2 weeks ago
- The official implementation of work "REPARO: Compositional 3D Assets Generation with Differentiable 3D Layout Alignment".☆122Updated last year
- Seeing from Another Perspective: Evaluating Multi-View Understanding in MLLMs☆60Updated last month
- Thinking with Videos from Open-Source Priors. We reproduce chain-of-frames visual reasoning by fine-tuning open-source video models. Give…☆207Updated 3 months ago
- A curated list of awesome autoregressive papers in Generative AI☆142Updated 4 months ago
- [NeurIPS 2024] DEMO: Enhancing Motion in Text-to-Video Generation with Decomposed Encoding and Conditioning☆47Updated last year
- SpaceR: The first MLLM empowered by SG-RLVR for video spatial reasoning☆103Updated 7 months ago
- BranchGRPO: Stable and Efficient GRPO with Structured Branching in Diffusion Models☆40Updated 3 months ago