Dataset splits and evaluation code for the paper "Benchmark for Compositional Text-to-Image Synthesis" (NeurIPS 2021)
☆45 · May 3, 2022 · Updated 3 years ago
Alternatives and similar repositories for comp-t2i-dataset
Users who are interested in comp-t2i-dataset are comparing it to the libraries listed below:
- VPEval Codebase from Visual Programming for Text-to-Image Generation and Evaluation (NeurIPS 2023) ☆45 · Nov 29, 2023 · Updated 2 years ago
- SMILE: A Multimodal Dataset for Understanding Laughter ☆13 · Jun 15, 2023 · Updated 2 years ago
- Official This-Is-My Dataset published in CVPR 2023 ☆16 · Jul 18, 2024 · Updated last year
- Official code repository for the paper: "TAPS3D: Text-Guided 3D Textured Shape Generation from Pseudo Supervision" ☆44 · Jun 9, 2023 · Updated 2 years ago
- ☆15 · Oct 24, 2024 · Updated last year
- Code and datasets for "Text encoders are performance bottlenecks in contrastive vision-language models". Coming soon! ☆11 · May 24, 2023 · Updated 2 years ago
- ☆31 · Mar 24, 2022 · Updated 3 years ago
- A collection of resources on generation. ☆13 · Oct 9, 2022 · Updated 3 years ago
- source code for Stable Diffusion with Perp-Neg ☆195 · Aug 25, 2023 · Updated 2 years ago
- How well can Text-to-Image Generative Models understand Ethical Natural Language Interventions? ☆13 · Aug 16, 2023 · Updated 2 years ago
- ☆19 · Aug 6, 2024 · Updated last year
- ☆19 · Jan 30, 2023 · Updated 3 years ago
- Awesome-AI4Earth: a curated list of machine learning in Earth System, especially for weather and climate. ☆13 · Dec 27, 2023 · Updated 2 years ago
- ☆13 · Jul 20, 2024 · Updated last year
- ViLMA: A Zero-Shot Benchmark for Linguistic and Temporal Grounding in Video-Language Models (ICLR 2024, Official Implementation) ☆16 · Jan 18, 2024 · Updated 2 years ago
- End-to-end Multi-modal Video Temporal Grounding, NeurIPS 2021 ☆18 · Oct 24, 2021 · Updated 4 years ago
- Holistic Coverage and Faithfulness Evaluation of Large Vision-Language Models (ACL-Findings 2024) ☆16 · Apr 23, 2024 · Updated last year
- Official implementation of the paper The Hidden Language of Diffusion Models ☆77 · Jan 24, 2024 · Updated 2 years ago
- ☆15 · Sep 25, 2021 · Updated 4 years ago
- [Preprint'23] "Efficient Meshy Neural Fields for Animatable Human Avatars" https://arxiv.org/abs/2303.12965 ☆25 · Sep 30, 2024 · Updated last year
- Official code for our CVPR 2023 paper: Test of Time: Instilling Video-Language Models with a Sense of Time ☆46 · Jun 11, 2024 · Updated last year
- Code for "Are “Hierarchical” Visual Representations Hierarchical?" in NeurIPS Workshop for Symmetry and Geometry in Neural Representation… ☆22 · Nov 8, 2023 · Updated 2 years ago
- Score Jacobian Chaining: Lifting Pretrained 2D Diffusion Models for 3D Generation (CVPR 2023) ☆523 · Mar 13, 2024 · Updated last year
- Code for paper LAFITE: Towards Language-Free Training for Text-to-Image Generation (CVPR 2022) ☆183 · Mar 23, 2023 · Updated 2 years ago
- Official implementation of Aurora ☆85 · Sep 20, 2023 · Updated 2 years ago
- Educational repository for applying the main video data curation techniques presented in the Stable Video Diffusion paper. ☆81 · Dec 30, 2023 · Updated 2 years ago
- (ICCV 2023) official repository for "Fantasia3D: Disentangling Geometry and Appearance for High-quality Text-to-3D Content Creation" ☆777 · May 29, 2024 · Updated last year
- A light-weight data management system for large-scale pretraining ☆21 · May 17, 2025 · Updated 9 months ago
- Code for the paper "If at First You Don't Succeed, Try, Try Again: Faithful Diffusion-based Text-to-Image Generation by Selection" ☆27 · Jul 10, 2023 · Updated 2 years ago
- ☆54 · Jul 31, 2022 · Updated 3 years ago
- Evaluating Vision & Language Pretraining Models with Objects, Attributes and Relations. [EMNLP 2022] ☆137 · Sep 29, 2024 · Updated last year
- collection of pitch (f0, fundamental frequency) detection algorithms with unified interface ☆25 · Nov 25, 2024 · Updated last year
- Unofficial implementation of 2D ProlificDreamer ☆145 · Jan 6, 2025 · Updated last year
- [AAAI 2024] ConceptBed Evaluations for Personalized Text-to-Image Diffusion Models ☆25 · Jun 1, 2023 · Updated 2 years ago
- This repository contains the code and data for the paper "VisOnlyQA: Large Vision Language Models Still Struggle with Visual Perception o… ☆29 · Jul 9, 2025 · Updated 8 months ago
- A pytorch implementation of “X-Dreamer: Creating High-quality 3D Content by Bridging the Domain Gap Between Text-to-2D and Text-to-3D Gen… ☆74 · May 11, 2024 · Updated last year
- TIFA: Accurate and Interpretable Text-to-Image Faithfulness Evaluation with Question Answering ☆181 · Apr 29, 2024 · Updated last year
- Official PyTorch implementation of Vision DiffMask, a post-hoc interpretation method for vision models. ☆32 · Mar 5, 2024 · Updated 2 years ago
- AQUA dataset and VIKING model for the task of Art Visual Question Answering ☆27 · Jun 4, 2021 · Updated 4 years ago