wyczzy / AIGI-Holmes
(ICCV 2025) This repository is the official implementation of AIGI-Holmes: Towards Explainable and Generalizable AI-Generated Image Detection via Multimodal Large Language Models
☆69Updated last week
Alternatives and similar repositories for AIGI-Holmes
Users who are interested in AIGI-Holmes are comparing it to the repositories listed below.
- WISE: A World Knowledge-Informed Semantic Evaluation for Text-to-Image Generation☆126Updated last month
- Official Repository of paper: Envisioning Beyond the Pixels: Benchmarking Reasoning-Informed Visual Editing☆69Updated last week
- [ECCV2024] The official implementation of the DiffPNG paper in PyTorch.☆12Updated 8 months ago
- TokLIP: Marry Visual Tokens to CLIP for Multimodal Comprehension and Generation☆101Updated last month
- [NeurIPS2024]☆25Updated 6 months ago
- Official implementation for "Diffusion Model is Secretly a Training-free Open Vocabulary Semantic Segmenter"☆40Updated last year
- [CVPR 2025] VASparse: Towards Efficient Visual Hallucination Mitigation via Visual-Aware Token Sparsification☆34Updated 3 months ago
- Exposing Text-Image Inconsistency Using Diffusion Models (ICLR 2024)☆10Updated last year
- [CVPR2025] FlashSloth: Lightning Multimodal Large Language Models via Embedded Visual Compression☆45Updated 4 months ago
- Unified layout planning and image generation, ICCV2025☆25Updated 3 months ago
- [CVPRW 2025] UniToken is an auto-regressive generation model that combines discrete and continuous representations to process visual inpu…☆86Updated 2 months ago
- [ICCV2025] Code Release of Harmonizing Visual Representations for Unified Multimodal Understanding and Generation☆141Updated last month
- A collection of vision foundation models unifying understanding and generation.☆56Updated 6 months ago
- Official code for "DiffX: Guide Your Layout to Cross-Modal Generative Modeling"☆22Updated 4 months ago
- ☆21Updated 5 months ago
- Official code for CVPR 2024 paper: Discriminative Probing and Tuning for Text-to-Image Generation☆33Updated 3 months ago
- ☆88Updated 3 months ago
- The official implementation of the paper "LEGION: Learning to Ground and Explain for Synthetic Image Detection"☆42Updated last month
- ☆66Updated 2 months ago
- This repository is the official implementation of "Look-Back: Implicit Visual Re-focusing in MLLM Reasoning".☆27Updated this week
- [NeurIPS 2024] One Token to Seg Them All: Language Instructed Reasoning Segmentation in Videos☆123Updated 6 months ago
- 🔥CVPR 2025 Multimodal Large Language Models Paper List☆147Updated 4 months ago
- ☆128Updated 2 weeks ago
- CAR: Controllable AutoRegressive Modeling for Visual Generation☆120Updated 7 months ago
- [NeurIPS 2024] COVE: Unleashing the Diffusion Feature Correspondence for Consistent Video Editing☆24Updated 7 months ago
- [CVPR 2025] Adaptive Keyframe Sampling for Long Video Understanding☆80Updated 2 months ago
- Official repository of NeurIPS D&B Track 2024 paper "VERIFIED: A Video Corpus Moment Retrieval Benchmark for Fine-Grained Video Understan…☆35Updated 5 months ago
- 📖 This is a repository for organizing papers, codes, and other resources related to unified multimodal models.☆256Updated 3 weeks ago
- VisualQuality-R1 is the first open-sourced NR-IQA model that can accurately describe and rate image quality.☆56Updated last month
- The official implementation of A Counting-Aware Hierarchical Decoding Framework for Generalized Referring Expression Segmentation☆18Updated 7 months ago