yuanc3 / DATE
Use 2 lines to empower absolute time awareness for Qwen2.5VL's MRoPE
☆18 Updated this week
Alternatives and similar repositories for DATE
Users that are interested in DATE are comparing it to the libraries listed below
- The official implementation of the paper "LEGION: Learning to Ground and Explain for Synthetic Image Detection" ☆58 Updated 3 months ago
- FakeVLM: Advancing Synthetic Image Detection through Explainable Multimodal Models and Fine-Grained Artifact Analysis ☆72 Updated 3 months ago
- (ICCV 2025) This repository is the official implementation of AIGI-Holmes: Towards Explainable and Generalizable AI-Generated Image Detect… ☆118 Updated 2 months ago
- ☆17 Updated 7 months ago
- ☆70 Updated 5 months ago
- Think or Not Think: A Study of Explicit Thinking in Rule-Based Visual Reinforcement Fine-Tuning ☆63 Updated 4 months ago
- The official implementation of 'FFAA: Multimodal Large Language Model based Explainable Open-World Face Forgery Analysis Assistant' ☆44 Updated 10 months ago
- [ICLR'25] Official code for the paper 'MLLMs Know Where to Look: Training-free Perception of Small Visual Details with Multimodal LLMs' ☆262 Updated 5 months ago
- [ICLR 2025] LLaVA-MoD: Making LLaVA Tiny via MoE-Knowledge Distillation ☆196 Updated 5 months ago
- [KDD2025] Improving Synthetic Image Detection Towards Generalization: An Image Transformation Perspective ☆80 Updated 2 weeks ago
- The official implementation of RAR ☆93 Updated last year
- [CVPRW 2025] UniToken is an auto-regressive generation model that combines discrete and continuous representations to process visual inpu… ☆92 Updated 4 months ago
- [ICCV 2025] Official implementation of LLaVA-KD: A Framework of Distilling Multimodal Large Language Models ☆96 Updated 2 months ago
- New generation of CLIP with fine-grained discrimination capability, ICML2025 ☆297 Updated last week
- [CVPR 2025] LamRA: Large Multimodal Model as Your Advanced Retrieval Assistant ☆149 Updated 2 months ago
- [ICLR 2025] TRACE: Temporal Grounding Video LLM via Causal Event Modeling ☆124 Updated 3 weeks ago
- ☆83 Updated last month
- The official repository of "SIDA: Social Media Image Deepfake Detection, Localization and Explanation with Large Multimodal Model" ☆95 Updated last month
- This repository is the code for the paper 'DeMamba: AI-Generated Video Detection on Million-Scale GenVideo Benchmark'. ☆135 Updated 8 months ago
- [ECCV 2024] SAM4MLLM: Enhance Multi-Modal Large Language Model for Referring Expression Segmentation ☆39 Updated 6 months ago
- [CVPR 2025] RAP: Retrieval-Augmented Personalization ☆69 Updated last month
- ☆120 Updated 6 months ago
- The Next Step Forward in Multimodal LLM Alignment ☆179 Updated 4 months ago
- Unified Multi-modal IAA Baseline and Benchmark ☆85 Updated 11 months ago
- Official implementation of the paper ReTaKe: Reducing Temporal and Knowledge Redundancy for Long Video Understanding ☆36 Updated 6 months ago
- ✨✨Latest Papers on AI-Generated Video Detection and Related Areas ☆83 Updated 2 weeks ago
- [CVPR2025] Precise, Fast, and Low-cost Concept Erasure in Value Space: Orthogonal Complement Matters ☆39 Updated 6 months ago
- Official codes for "Q-Ground: Image Quality Grounding with Large Multi-modality Models", ACM MM2024 (Oral) ☆42 Updated 10 months ago
- [ECCV 2024] ShareGPT4V: Improving Large Multi-modal Models with Better Captions ☆234 Updated last year
- 🔥CVPR 2025 Multimodal Large Language Models Paper List ☆153 Updated 6 months ago