opendatalab / LOKI
The official implementation of the paper “LOKI: A Comprehensive Synthetic Data Detection Benchmark using Large Multimodal Models”
☆107Updated 2 weeks ago
Related projects
Alternatives and complementary repositories for LOKI
- The official implementation of the paper “Street-to-Satellite Image Synthesis with Diffusion Models and BEV Paradigm”☆26Updated 2 weeks ago
- This is the repo for the paper Multi-Agent Collaborative Data Selection for Efficient LLM Pretraining.☆27Updated 3 weeks ago
- Awesome lists about framework figures in papers☆38Updated last month
- The official PyTorch implementation of Exploring the Interactive Guidance for Unified and Effective Image Matting☆23Updated 7 months ago
- [ECCV 2024] ShareGPT4V: Improving Large Multi-modal Models with Better Captions☆151Updated 4 months ago
- [NeurIPS 2024] Repo for the paper "ControlMLLM: Training-Free Visual Prompt Learning for Multimodal Large Language Models"☆89Updated last month
- [CVPR 2024🔥] Unleashing Unlabeled Data: A Paradigm for Cross-View Geo-Localization☆91Updated 4 months ago
- 📖 This is a repository for organizing papers, codes and other resources related to unified multimodal models.☆207Updated this week
- Explore the Limits of Omni-modal Pretraining at Scale☆89Updated 2 months ago
- [NeurIPS 2024] MoVA: Adapting Mixture of Vision Experts to Multimodal Context☆132Updated last month
- The official implementation of the paper: Sat2Density: Faithful Density Learning from Satellite-Ground Image Pairs (ICCV 2023)☆40Updated 4 months ago
- The official implementation of "Segment Anything with Multiple Modalities".☆66Updated 2 months ago
- [NeurIPS 2024] Mitigating Object Hallucination via Concentric Causal Attention☆42Updated this week
- Empowering Unified MLLM with Multi-granular Visual Generation☆104Updated 3 weeks ago
- Vision Search Assistant: Empower Vision-Language Models as Multimodal Search Engines☆46Updated last week
- 🔥🔥🔥 A curated list of papers on LLMs-based multimodal generation (image, video, 3D and audio).☆346Updated last week
- Official implementation of MIA-DPO☆32Updated last week
- [CVPR 2024] Prompt Highlighter: Interactive Control for Multi-Modal LLMs☆132Updated 3 months ago
- [ECCV24] VISA: Reasoning Video Object Segmentation via Large Language Model☆125Updated 3 months ago
- ✨✨ MME-RealWorld: Could Your Multimodal LLM Challenge High-Resolution Real-World Scenarios that are Difficult for Humans?☆77Updated last month
- 🔥🔥 First-ever hour-scale video understanding models☆156Updated 2 weeks ago
- LLaVA-PruMerge: Adaptive Token Reduction for Efficient Large Multimodal Models☆98Updated 5 months ago
- Project for "LaSagnA: Language-based Segmentation Assistant for Complex Queries".☆47Updated 6 months ago
- [NeurIPS 2024] This repo contains evaluation code for the paper "Are We on the Right Way for Evaluating Large Vision-Language Models"☆147Updated last month
- Implements VAR+CLIP for image generation☆78Updated 3 months ago
- The official implementation of the paper "Cross-view image geo-localization with Panorama-BEV Co-Retrieval Network" (ECCV 2024)☆34Updated last week
- [ECCV 2024] BenchLMM: Benchmarking Cross-style Visual Capability of Large Multimodal Models☆81Updated 2 months ago
- ☆21Updated 3 months ago
- VoCo-LLaMA: This repo is the official implementation of "VoCo-LLaMA: Towards Vision Compression with Large Language Models".☆81Updated 4 months ago
- Official implementation of the Law of Vision Representation in MLLMs☆128Updated 2 months ago