AlaaLab / InstructCV
[ ICLR 2024 ] Official Codebase for "InstructCV: Instruction-Tuned Text-to-Image Diffusion Models as Vision Generalists"
☆462 · Updated 11 months ago
Alternatives and similar repositories for InstructCV:
Users who are interested in InstructCV are comparing it to the repositories listed below.
- [CVPR 2024] Official code for "Text-Driven Image Editing via Learnable Regions" ☆222 · Updated 6 months ago
- [IJCV] Show-1: Marrying Pixel and Latent Diffusion Models for Text-to-Video Generation ☆1,113 · Updated 5 months ago
- [ICML 2023 Oral, NeurIPS 2023] Official implementations for paper: Customizable Image Synthesis with Multiple Subjects ☆434 · Updated last year
- [CVPR 2025 Highlight] Official code for "Olympus: A Universal Task Router for Computer Vision Tasks" ☆345 · Updated 2 weeks ago
- Efficient DiT architecture for text2any tasks, ICLR2025 ☆429 · Updated 2 months ago
- Video generation from text&image, 1st-gen ☆882 · Updated 2 months ago
- ☆126 · Updated 2 weeks ago
- This project is the official implementation of 'LLMGA: Multimodal Large Language Model based Generation Assistant', ECCV2024 Oral ☆394 · Updated 8 months ago
- Evaluation of Text-to-Video Generation Models: A Dynamics Perspective [NeurIPS 2024]. ☆268 · Updated 4 months ago
- ☆138 · Updated 2 months ago
- Video-Inpaint-Anything: This is the inference code for our paper CoCoCo: Improving Text-Guided Video Inpainting for Better Consistency, C… ☆288 · Updated 7 months ago
- ☆91 · Updated last year
- [NeurIPS 2024] Matryoshka Query Transformer for Large Vision-Language Models ☆104 · Updated 9 months ago
- Vchitect-2.0: Parallel Transformer for Scaling Up Video Diffusion Models ☆910 · Updated last month
- GeoDream: Disentangling 2D and Geometric Priors for High-Fidelity and Consistent 3D Generation ☆495 · Updated last year
- [NeurIPS 2024] An official implementation of ShareGPT4Video: Improving Video Understanding and Generation with Better Captions ☆1,053 · Updated 6 months ago
- [NeurIPS'24] Training-Free Adaptive Diffusion with Bounded Difference Approximation Strategy ☆64 · Updated 3 months ago
- Pytorch Implementation of FLATTEN: optical FLow-guided ATTENtion for consistent text-to-video editing (ICLR 2024) ☆206 · Updated 11 months ago
- 🔥 🔥 🔥 [NeurIPS 2024] Hawk: Learning to Understand Open-World Video Anomalies ☆194 · Updated last week
- SAM2Long: Enhancing SAM 2 for Long Video Segmentation with a Training-Free Memory Tree ☆463 · Updated 4 months ago
- Official repository of MMGenBench ☆119 · Updated last month
- Unofficial Implementation of ReplaceAnything: https://aigcdesigngroup.github.io/replace-anything/ ☆403 · Updated 10 months ago
- [CVPR'25] Tora: Trajectory-oriented Diffusion Transformer for Video Generation ☆1,129 · Updated last month
- Mulberry, an o1-like Reasoning and Reflection MLLM Implemented via Collective MCTS ☆1,171 · Updated 3 weeks ago
- 📌 [Arxiv2025] Official implementation of "NeuralGS: Bridging Neural Fields and 3D Gaussian Splatting for Compact 3D Representation" ☆166 · Updated 3 weeks ago
- Region-Aware Text-to-Image Generation via Hard Binding and Soft Refinement 🔥 ☆566 · Updated 3 months ago
- ☆289 · Updated 9 months ago
- A post-training method to enhance CLIP's fine-grained visual representations with generative models. ☆48 · Updated 3 weeks ago
- The official PyTorch implementation of Diffusion Time-step Curriculum for One Image to 3D Generation (CVPR 2024) ☆76 · Updated 10 months ago
- An unofficial implementation of the paper "TopNet: Transformer-based Object Placement Network for Image Compositing", CVPR 2023. ☆23 · Updated 3 months ago