Deaddawn / MovieLLM-code
☆166Updated 4 months ago
Related projects
Alternatives and complementary repositories for MovieLLM-code
- ☆145Updated 2 months ago
- Multimodal Models in Real World☆403Updated 3 weeks ago
- ☆173Updated 3 months ago
- ☆165Updated 4 months ago
- ☆141Updated 4 months ago
- This is the official implementation of "Flash-VStream: Memory-Based Real-Time Understanding for Long Video Streams"☆130Updated 3 months ago
- A Training-free Iterative Framework for Long Story Visualization☆61Updated this week
- MuLan: Adapting Multilingual Diffusion Models for 110+ Languages (adds multilingual support to any diffusion model without additional training)☆127Updated 5 months ago
- UniPortrait: A Unified Framework for Identity-Preserving Single- and Multi-Human Image Personalization☆202Updated last month
- Official Code for GPT4Video: A Unified Multimodal Large Language Model for Instruction-Followed Understanding and Safety-Aware Generation☆134Updated 3 weeks ago
- 🔥🔥First-ever hour-scale video understanding models☆166Updated 3 weeks ago
- A new multi-shot video understanding benchmark Shot2Story with comprehensive video summaries and detailed shot-level captions.☆99Updated last month
- An initiative to replicate Sora☆99Updated 7 months ago
- ☆254Updated 3 months ago
- UniEdit: A Unified Tuning-Free Framework for Video Motion and Appearance Editing☆91Updated 2 weeks ago
- 🔥 CharacterFactory: Sampling Consistent Characters with GANs for Diffusion Models☆195Updated 4 months ago
- VCoder: Versatile Vision Encoders for Multimodal Large Language Models, arXiv 2023 / CVPR 2024☆261Updated 7 months ago
- ☆278Updated 2 weeks ago
- An open source community implementation of the model from the paper: "Movie Gen: A Cast of Media Foundation Models". Join our community …☆55Updated this week
- Official GPU implementation of the paper "PPLLaVA: Varied Video Sequence Understanding With Prompt Guidance"☆94Updated this week
- Code repository for T2V-Turbo and T2V-Turbo-v2☆250Updated last month
- ☆356Updated 5 months ago
- SCEPTER is an open-source framework used for training, fine-tuning, and inference with generative models.☆428Updated 2 weeks ago
- [NeurIPS 2024] VideoTetris: Towards Compositional Text-To-Video Generation☆206Updated 2 weeks ago
- MoMA: Multimodal LLM Adapter for Fast Personalized Image Generation☆191Updated 4 months ago
- Repository for 23'MM accepted paper "Curriculum-Listener: Consistency- and Complementarity-Aware Audio-Enhanced Temporal Sentence Groundi…☆42Updated 10 months ago
- Diffree: Text-Guided Shape Free Object Inpainting with Diffusion Model☆234Updated 3 months ago
- I2V-Adapter: A General Image-to-Video Adapter for Video Diffusion Models☆201Updated 10 months ago
- [ICLR 2024] Code for FreeNoise based on VideoCrafter☆386Updated 4 months ago
- ☆217Updated 7 months ago