AlaaLab / InstructCV
[ ICLR 2024 ] Official Codebase for "InstructCV: Instruction-Tuned Text-to-Image Diffusion Models as Vision Generalists"
☆460 Updated 10 months ago
Alternatives and similar repositories for InstructCV:
Users who are interested in InstructCV compare it to the repositories listed below.
- [CVPR 2024] Official code for "Text-Driven Image Editing via Learnable Regions" ☆215 Updated 5 months ago
- [IJCV] Show-1: Marrying Pixel and Latent Diffusion Models for Text-to-Video Generation ☆1,021 Updated 3 months ago
- Efficient DiT architecture for text2any tasks, ICLR2025 ☆378 Updated 2 weeks ago
- [NeurIPS 2023] Official implementations for paper: Customizable Image Synthesis with Multiple Subjects ☆428 Updated last year
- Video-Inpaint-Anything: This is the inference code for our paper CoCoCo: Improving Text-Guided Video Inpainting for Better Consistency, C… ☆271 Updated 5 months ago
- Video generation from text&image, 1st-gen ☆791 Updated 2 weeks ago
- [NeurIPS 2024] An official implementation of ShareGPT4Video: Improving Video Understanding and Generation with Better Captions ☆1,041 Updated 4 months ago
- ☆91 Updated 11 months ago
- GeoDream: Disentangling 2D and Geometric Priors for High-Fidelity and Consistent 3D Generation ☆495 Updated last year
- 🔥 🔥 🔥 [NeurIPS 2024] Hawk: Learning to Understand Open-World Video Anomalies ☆172 Updated last week
- Vchitect-2.0: Parallel Transformer for Scaling Up Video Diffusion Models ☆888 Updated last month
- [NeurIPS 2024] Matryoshka Query Transformer for Large Vision-Language Models ☆98 Updated 8 months ago
- [NeurIPS'24] Training-Free Adaptive Diffusion with Bounded Difference Approximation Strategy ☆61 Updated last month
- This project is the official implementation of 'LLMGA: Multimodal Large Language Model based Generation Assistant', ECCV2024 Oral ☆389 Updated 6 months ago
- Official repository of MMGenBench ☆119 Updated 3 months ago
- Real-time and accurate open-vocabulary end-to-end object detection ☆1,294 Updated 2 months ago
- Unofficial Implementation of ReplaceAnything: https://aigcdesigngroup.github.io/replace-anything/ ☆403 Updated 9 months ago
- OMG-LLaVA and OMG-Seg codebase [CVPR-24 and NeurIPS-24] ☆1,238 Updated 2 months ago
- Evaluation of Text-to-Video Generation Models: A Dynamics Perspective [NeurIPS 2024]. ☆261 Updated 3 months ago
- PyTorch Implementation of FLATTEN: optical FLow-guided ATTENtion for consistent text-to-video editing (ICLR 2024) ☆206 Updated 9 months ago
- Uncommon Objects in 3D dataset ☆957 Updated last month
- Improving Generalist Model with Domain-Specific Experts ☆84 Updated last month
- ☆158 Updated 4 months ago
- Mulberry, an o1-like Reasoning and Reflection MLLM Implemented via Collective MCTS ☆779 Updated 2 weeks ago
- [CVPR 2025] The official code for "Olympus: A Universal Task Router for Computer Vision Tasks" ☆48 Updated this week
- [CVPR2024] Instruct 4D-to-4D: Editing 4D Scenes as Pseudo-3D Scenes Using 2D Diffusion ☆112 Updated 4 months ago
- ☆286 Updated 7 months ago
- [ACM MM'2024] Official repository for "Semantic Editing Increment Benefits Zero-Shot Composed Image Retrieval" ☆36 Updated 2 months ago
- The official PyTorch implementation of Diffusion Time-step Curriculum for One Image to 3D Generation (CVPR 2024) ☆75 Updated 8 months ago