Official Implementation of OpenING: A Comprehensive Benchmark for Judging Open-ended Interleaved Image-Text Generation
☆38 · Jul 5, 2025 · Updated 7 months ago
Alternatives and similar repositories for OpenING
Users who are interested in OpenING are comparing it to the repositories listed below
- Scripting Multi-Scene Videos with Time-Aware and Structural Audio-Visual Captions ☆21 · Feb 11, 2026 · Updated 2 weeks ago
- ☆13 · Jan 22, 2025 · Updated last year
- [CVPR 2025] VISCO: Benchmarking Fine-Grained Critique and Correction Towards Self-Improvement in Visual Reasoning ☆14 · Jun 7, 2025 · Updated 8 months ago
- A flexible & scalable MLLM-based AIGC detection pipeline ☆28 · Oct 27, 2025 · Updated 4 months ago
- [ICCV 2025] Official Implementation of Steering Rectified Flow Models in the Vector Field for Controlled Image Generation ☆44 · Jun 27, 2025 · Updated 8 months ago
- A Comprehensive Dataset for Advanced Image Generation and Editing ☆31 · Oct 2, 2025 · Updated 5 months ago
- This repository organizes the ImageNet-1k dataset into 10 coarse classes, where each class consists of semantically similar image categorie… ☆22 · Dec 11, 2023 · Updated 2 years ago
- 🚀 LLM-I: Transform LLMs into natural interleaved multimodal creators! ✨ Tool-use framework supporting image search, generation, code ex… ☆41 · Oct 20, 2025 · Updated 4 months ago
- Reading list for research topics in embodied vision ☆11 · Dec 5, 2021 · Updated 4 years ago
- SmartCLIP: A training method to improve CLIP with both short and long texts ☆37 · Jun 18, 2025 · Updated 8 months ago
- Official repository for CoMM Dataset ☆50 · Dec 31, 2024 · Updated last year
- ☆26 · Jun 22, 2024 · Updated last year
- ☆56 · Jan 30, 2026 · Updated last month
- GenExam: A Multidisciplinary Text-to-Image Exam ☆56 · Updated this week
- This is the official repo of OpenSatMap in NeurIPS 2024 D&B Track ☆29 · Jul 6, 2025 · Updated 7 months ago
- [ICLR 2025] Official implementation and benchmark evaluation repository of <PhysBench: Benchmarking and Enhancing Vision-Language Models … ☆85 · Jan 21, 2026 · Updated last month
- Modality Gap–Driven Subspace Alignment Training Paradigm For Multimodal Large Language Models ☆51 · Feb 23, 2026 · Updated last week
- [ICCV 2025] FonTS: Text Rendering with Typography and Style Controls ☆39 · Nov 5, 2025 · Updated 3 months ago
- M2-Reasoning: Empowering MLLMs with Unified General and Spatial Reasoning ☆46 · Jul 17, 2025 · Updated 7 months ago
- Complex-Edit: CoT-Like Instruction Generation for Complexity-Controllable Image Editing Benchmark ☆28 · Apr 22, 2025 · Updated 10 months ago
- [ACL2025 Oral & Award] Evaluate Image/Video Generation like Humans - Fast, Explainable, Flexible ☆121 · Aug 10, 2025 · Updated 6 months ago
- ☆31 · Jun 12, 2024 · Updated last year
- [CVPR 2025] DoraCycle: Domain-Oriented Adaptation of Unified Generative Model in Multimodal Cycles ☆30 · May 13, 2025 · Updated 9 months ago
- A Text2SQL benchmark for evaluation of Large Language Models ☆41 · Updated this week
- ☆30 · Nov 7, 2023 · Updated 2 years ago
- The first spoken long-text dataset derived from live streams, designed to reflect the redundancy-rich and conversational nature of real-w… ☆12 · Jun 28, 2025 · Updated 8 months ago
- The official repository of "SmartAgent: Chain-of-User-Thought for Embodied Personalized Agent in Cyber World". ☆27 · Aug 20, 2025 · Updated 6 months ago
- ☆18 · Dec 5, 2021 · Updated 4 years ago
- (ICLR 2025 Spotlight) Official code repository for Interleaved Scene Graph. ☆31 · Aug 7, 2025 · Updated 6 months ago
- a unified reinforcement learning toolbox for joint RL on language models and diffusion models ☆75 · Feb 7, 2026 · Updated 3 weeks ago
- ☆39 · May 20, 2025 · Updated 9 months ago
- Official repo of the ICLR 2025 paper "MMWorld: Towards Multi-discipline Multi-faceted World Model Evaluation in Videos" ☆28 · Jul 15, 2025 · Updated 7 months ago
- Official implementation of "Art-Free Generative Models: Art Creation Without Graphic Art Knowledge" ☆32 · Nov 30, 2025 · Updated 3 months ago
- Repo for "Q-Eval-100K: Evaluating Visual Quality and Alignment Level for Text-to-Vision Content" ☆40 · Jun 9, 2025 · Updated 8 months ago
- AgentRewardBench: Evaluating Automatic Evaluations of Web Agent Trajectories ☆40 · Aug 7, 2025 · Updated 6 months ago
- ☆18 · Jun 10, 2025 · Updated 8 months ago
- GMAN: Generative Meta-Adversarial Network for Unseen Object Navigation (ECCV 2022) ☆23 · Jan 13, 2024 · Updated 2 years ago
- A Lightweight Visual Reasoning Benchmark for Evaluating Large Multimodal Models through Complex Diagrams in Coding Tasks ☆14 · Feb 25, 2025 · Updated last year
- [NeurIPS ENLSP Workshop'24] CSKV: Training-Efficient Channel Shrinking for KV Cache in Long-Context Scenarios ☆16 · Oct 18, 2024 · Updated last year