atfortes / Awesome-Controllable-Diffusion
Papers and resources on Controllable Generation using Diffusion Models, including ControlNet, DreamBooth, and IP-Adapter.
☆441 · Updated 5 months ago
Alternatives and similar repositories for Awesome-Controllable-Diffusion:
Users interested in Awesome-Controllable-Diffusion are comparing it to the repositories listed below.
- 🔥🔥🔥 A curated list of papers on LLM-based multimodal generation (image, video, 3D, and audio). ☆438 · Updated last week
- This is a repository for organizing papers, code, and other resources related to unified multimodal models. ☆394 · Updated last month
- [ICLR 2024 Spotlight] DreamLLM: Synergistic Multimodal Comprehension and Creation ☆424 · Updated 3 months ago
- Code and models for the NeurIPS 2023 paper "Generating Images with Multimodal Language Models". ☆448 · Updated last year
- Chatbot Arena meets multi-modality! Multi-Modality Arena allows you to benchmark vision-language models side-by-side while providing imag… ☆504 · Updated 10 months ago
- (CVPR 2024) A benchmark for evaluating Multimodal LLMs using multiple-choice questions. ☆332 · Updated 2 months ago
- Official implementation of SEED-LLaMA (ICLR 2024). ☆600 · Updated 5 months ago
- Research Trends in LLM-guided Multimodal Learning. ☆357 · Updated last year
- LaVIT: Empower the Large Language Model to Understand and Generate Visual Content ☆567 · Updated 5 months ago
- [NeurIPS 2023 Datasets and Benchmarks Track] LAMM: Multi-Modal Large Language Models and Applications as AI Agents ☆308 · Updated 10 months ago
- Recent LLM-based CV and related works. Welcome to comment/contribute! ☆858 · Updated last week
- ☆314 · Updated last year
- A curated list of resources dedicated to hallucination of multimodal large language models (MLLM). ☆598 · Updated 2 months ago
- [ICLR'24] Mitigating Hallucination in Large Multi-Modal Models via Robust Instruction Tuning ☆272 · Updated last year
- Awesome_Multimodel is a curated GitHub repository that provides a comprehensive collection of resources for Multimodal Large Language Mod… ☆300 · Updated last month
- Aligning LMMs with Factually Augmented RLHF ☆352 · Updated last year
- Paper list about multimodal and large language models, used only to record papers I read from the daily arXiv for personal needs. ☆605 · Updated this week
- A list of works on evaluation of visual generation models, including evaluation metrics, models, and systems ☆257 · Updated 2 weeks ago
- [NeurIPS'24 Spotlight] Visual CoT: Advancing Multi-Modal Language Models with a Comprehensive Dataset and Benchmark for Chain-of-Thought … ☆269 · Updated 2 months ago
- ✨✨ [CVPR 2025] Video-MME: The First-Ever Comprehensive Evaluation Benchmark of Multi-modal LLMs in Video Analysis ☆480 · Updated 3 months ago
- [Survey] Next Token Prediction Towards Multimodal Intelligence: A Comprehensive Survey ☆383 · Updated last month
- ✨✨ Woodpecker: Hallucination Correction for Multimodal Large Language Models ☆633 · Updated 2 months ago
- MMICL, a state-of-the-art VLM with in-context learning (ICL) ability, from PKU ☆345 · Updated last year
- Official code for the paper "Mantis: Multi-Image Instruction Tuning" [TMLR 2024] ☆208 · Updated this week
- MM-Vet: Evaluating Large Multimodal Models for Integrated Capabilities (ICML 2024) ☆289 · Updated last month
- ☆414 · Updated 5 months ago
- ☆164 · Updated 8 months ago
- This repo lists relevant papers summarized in our survey paper: A Systematic Survey of Prompt Engineering on Vision-Language Foundation … ☆439 · Updated 4 months ago