opendatalab / LOKI

[ICLR 2025 Spotlight] The official implementation of the paper “LOKI: A Comprehensive Synthetic Data Detection Benchmark using Large Multimodal Models”

☆154

Alternatives and similar repositories for LOKI

Users who are interested in LOKI are comparing it to the repositories listed below

opendatalab / LEGION
The official implementation of the paper "LEGION: Learning to Ground and Explain for Synthetic Image Detection"
☆42 Updated last month
opendatalab / FakeVLM
FakeVLM: Advancing Synthetic Image Detection through Explainable Multimodal Models and Fine-Grained Artifact Analysis
☆55 Updated last month
Cooperx521 / ScaleCap
Official repository of 'ScaleCap: Inference-Time Scalable Image Captioning via Dual-Modality Debiasing'
☆50 Updated 3 weeks ago
mrwu-mac / ControlMLLM
[NeurIPS 2024] Repo for the paper 'ControlMLLM: Training-Free Visual Prompt Learning for Multimodal Large Language Models'
☆184 Updated last week
Hoar012 / RAP-MLLM
[CVPR 2025] RAP: Retrieval-Augmented Personalization
☆64 Updated 3 weeks ago
shikiw / Modality-Integration-Rate
[ICCV 2025] The official code of the paper "Deciphering Cross-Modal Alignment in Large Vision-Language Models with Modality Integration R…
☆101 Updated last week
JiuTian-VL / MoME
[NeurIPS 2024] MoME: Mixture of Multimodal Experts for Generalist Multimodal Large Language Models
☆68 Updated 2 months ago
yu-rp / VisualPerceptionToken
☆88 Updated 3 months ago
saccharomycetes / mllms_know
[ICLR'25] Official code for the paper 'MLLMs Know Where to Look: Training-free Perception of Small Visual Details with Multimodal LLMs'
☆230 Updated 2 months ago
opendatalab / UrBench
[AAAI 2025] This repo contains evaluation code for the paper “UrBench: A Comprehensive Benchmark for Evaluating Large Multimodal Models in…
☆31 Updated 3 months ago
xmed-lab / TAM
[ICCV 2025] Token Activation Map to Visually Explain Multimodal LLMs
☆40 Updated 2 weeks ago
ZhangXJ199 / TinyLLaVA-Video-R1
TinyLLaVA-Video-R1: Towards Smaller LMMs for Video Reasoning
☆83 Updated last month
PKU-ICST-MIPL / Finedefics_ICLR2025
☆66 Updated 2 months ago
MME-Benchmarks / MME-RealWorld
✨✨ [ICLR 2025] MME-RealWorld: Could Your Multimodal LLM Challenge High-Resolution Real-World Scenarios that are Difficult for Humans?
☆128 Updated 4 months ago
threegold116 / Awesome-Omni-MLLMs
Repository for the ACL 2025 Findings paper "From Specific-MLLMs to Omni-MLLMs: A Survey on MLLMs Aligned with Multi-modalities"
☆40 Updated 2 weeks ago
minglllli / CLS-RL
Think or Not Think: A Study of Explicit Thinking in Rule-Based Visual Reinforcement Fine-Tuning
☆50 Updated last month
Wang-Xiaodong1899 / CVPR25-MLLM-Paper-List
🔥CVPR 2025 Multimodal Large Language Models Paper List
☆147 Updated 4 months ago
1zhou-Wang / MemVR
[ICML 2025] Official implementation of paper 'Look Twice Before You Answer: Memory-Space Visual Retracing for Hallucination Mitigation in…
☆141 Updated last week
TIGER-AI-Lab / Pixel-Reasoner
Pixel-Level Reasoning Model trained with RL
☆167 Updated 2 weeks ago
Wild-Cooperation-Hub / Awesome-MLLM-Reasoning-Benchmarks
A Comprehensive Survey on Evaluating Reasoning Capabilities in Multimodal Large Language Models.
☆66 Updated 3 months ago
ShareGPT4Omni / ShareGPT4V
[ECCV 2024] ShareGPT4V: Improving Large Multi-modal Models with Better Captions
☆225 Updated last year
zjunlp / Deco
[ICLR 2025] MLLM can see? Dynamic Correction Decoding for Hallucination Mitigation
☆89 Updated 7 months ago
Yxxxb / VoCo-LLaMA
[CVPR'2025] VoCo-LLaMA: This repo is the official implementation of "VoCo-LLaMA: Towards Vision Compression with Large Language Models".
☆176 Updated 3 weeks ago
xing0047 / cca-llava
[NeurIPS 2024] Mitigating Object Hallucination via Concentric Causal Attention
☆57 Updated 6 months ago
Cooperx521 / PyramidDrop
(CVPR 2025) PyramidDrop: Accelerating Your Large Vision-Language Models via Pyramid Visual Redundancy Reduction
☆115 Updated 4 months ago
Purshow / Awesome-Unified-Multimodal
📖 This is a repository for organizing papers, codes, and other resources related to unified multimodal models.
☆256 Updated 3 weeks ago
zhangquanchen / VisRL
[ICCV 2025] VisRL: Intention-Driven Visual Perception via Reinforced Reasoning
☆32 Updated last month
Kwai-YuanQi / MM-RLHF
The Next Step Forward in Multimodal LLM Alignment
☆170 Updated 2 months ago
Liuziyu77 / MIA-DPO
Official implementation of MIA-DPO
☆59 Updated 5 months ago
swordlidev / Evaluation-Multimodal-LLMs-Survey
A Survey on Benchmarks of Multimodal Large Language Models
☆120 Updated 2 weeks ago