[CVPR 2025] Science-T2I: Addressing Scientific Illusions in Image Synthesis
☆62 Apr 27, 2025 Updated 10 months ago
Alternatives and similar repositories for Science-T2I
Users who are interested in Science-T2I are comparing it to the repositories listed below
- A Massive Multi-Discipline Lecture Understanding Benchmark ☆33 Nov 1, 2025 Updated 4 months ago
- ☆25 Mar 30, 2025 Updated 11 months ago
- [ICLR 2025] Source code for paper "A Spark of Vision-Language Intelligence: 2-Dimensional Autoregressive Transformer for Efficient Finegr… ☆79 Dec 10, 2024 Updated last year
- Official implementation for "Nested Attention: Semantic-aware Attention Values for Concept Personalization" [SIGGRAPH 2025] ☆27 Aug 4, 2025 Updated 7 months ago
- ☆34 Dec 29, 2025 Updated 2 months ago
- ReNeg: Learning Negative Embedding with Reward Guidance ☆35 Dec 22, 2025 Updated 2 months ago
- BranchGRPO: Stable and Efficient GRPO with Structured Branching in Diffusion Models ☆40 Oct 30, 2025 Updated 4 months ago
- [ICLR 2025] AuroraCap: Efficient, Performant Video Detailed Captioning and a New Benchmark ☆138 Jun 4, 2025 Updated 9 months ago
- [ICCV2025] DCM: Dual-Expert Consistency Model for Efficient and High-Quality Video Generation ☆203 Jun 8, 2025 Updated 8 months ago
- ☆22 Jan 26, 2026 Updated last month
- Evaluation codes and data for GenEval2 ☆57 Jan 8, 2026 Updated last month
- Code for Commonsense-T2I Challenge: Can Text-to-Image Generation Models Understand Commonsense? [COLM 2024] ☆24 Aug 13, 2024 Updated last year
- Pytorch implementation for the paper titled "SimpleAR: Pushing the Frontier of Autoregressive Visual Generation" ☆425 Jun 20, 2025 Updated 8 months ago
- Code release for "PISA Experiments: Exploring Physics Post-Training for Video Diffusion Models by Watching Stuff Drop" (ICML 2025) ☆53 May 8, 2025 Updated 9 months ago
- MR. Video: MapReduce is the Principle for Long Video Understanding ☆31 Apr 23, 2025 Updated 10 months ago
- Hand Mesh Recovery models on OakInk-Image dataset ☆12 Apr 4, 2024 Updated last year
- ICCV'23 | Adverse Weather Removal with Codebook Priors ☆10 Aug 28, 2023 Updated 2 years ago
- Code for Scaling Language-Free Visual Representation Learning (WebSSL) ☆245 Apr 24, 2025 Updated 10 months ago
- [CVPR 2025] Exploring the Deep Fusion of Large Language Models and Diffusion Transformers for Text-to-Image Synthesis ☆131 May 16, 2025 Updated 9 months ago
- ☆33 Aug 9, 2024 Updated last year
- Real-time VIBE: Frame by Frame Inference of VIBE (Video Inference for Human Body Pose and Shape Estimation) ☆27 Dec 2, 2021 Updated 4 years ago
- FlexiFilm: Long Video Generation with Flexible Conditions ☆31 May 1, 2024 Updated last year
- Official repo for the TMLR paper "Discffusion: Discriminative Diffusion Models as Few-shot Vision and Language Learners" ☆30 Apr 27, 2024 Updated last year
- [ICLR26] GoT-R1: Unleashing Reasoning Capability of MLLM for Visual Generation with Reinforcement Learning ☆104 Jan 27, 2026 Updated last month
- UniDisc: A discrete diffusion model for joint multimodal generation, enabling controllable and efficient text-image synthesis, editing, a… ☆134 Apr 2, 2025 Updated 11 months ago
- (NeurIPS 2025 D&B Track) OverLayBench: A Benchmark for Layout-to-Image Generation with Dense Overlaps ☆25 Jan 22, 2026 Updated last month
- Code for MetaMorph Multimodal Understanding and Generation via Instruction Tuning ☆234 Jan 22, 2026 Updated last month
- ☆15 Jun 19, 2024 Updated last year
- [CVPR2025] PyTorch-based reimplementation of CrossFlow, as proposed in 'Flowing from Words to Pixels: A Noise-Free Framework for Cross-Mo… ☆329 Jun 8, 2025 Updated 8 months ago
- Video Generation, Physical Commonsense, Semantic Adherence, VideoCon-Physics ☆181 Jan 30, 2026 Updated last month
- Official Repository of paper: "MotionEdit: Benchmarking and Learning Motion-Centric Image Editing" ☆60 Updated this week
- Code and data for the paper: Learning Action and Reasoning-Centric Image Editing from Videos and Simulation ☆33 Jun 30, 2025 Updated 8 months ago
- Code for "Scaling Language-Free Visual Representation Learning" paper (Web-SSL). ☆201 Apr 29, 2025 Updated 10 months ago
- [NeurIPS 2025 Spotlight] A Unified Tokenizer for Visual Generation and Understanding ☆512 Nov 14, 2025 Updated 3 months ago
- Create Latents with Perlin Noise in any shape (dimensionality). Works with Flux, SD3 and other 16d latent models. ☆34 Aug 6, 2024 Updated last year
- Optimizing diffusion for production-ready speeds ☆37 Jan 10, 2026 Updated last month
- A simple script to see how my ideas evolve over time ☆44 Jun 4, 2025 Updated 9 months ago
- This repository includes the code to download the curated HuggingFace papers into a single markdown formatted file ☆16 Jul 26, 2024 Updated last year
- [CVPR 2026] Training-free Mixed-Resolution Latent Upsampling for Spatially Accelerated Diffusion Transformers ☆54 Feb 22, 2026 Updated last week