njucckevin / CapArena
An Arena-style Automated Evaluation Benchmark for Detailed Captioning
☆32 Updated last month
Alternatives and similar repositories for CapArena
Users who are interested in CapArena compare it to the repositories listed below.
- A Self-Training Framework for Vision-Language Reasoning ☆78 Updated 3 months ago
- [Preprint] A Neural-Symbolic Self-Training Framework ☆107 Updated last month
- ☆55 Updated 7 months ago
- ☆18 Updated 2 weeks ago
- A Survey on the Honesty of Large Language Models ☆57 Updated 5 months ago
- Implementation for the research paper "Enhancing LLM Reasoning via Critique Models with Test-Time and Training-Time Supervision". ☆52 Updated 5 months ago
- OpenThinkIMG is an end-to-end open-source framework that empowers LVLMs to think with images. ☆50 Updated this week
- [ICML 2025] M-STAR (Multimodal Self-Evolving TrAining for Reasoning) Project. Diving into Self-Evolving Training for Multimodal Reasoning ☆58 Updated 4 months ago
- ☆22 Updated 10 months ago
- An Easy-to-use Hallucination Detection Framework for LLMs. ☆58 Updated last year
- This repository contains the code for SFT, RLHF, and DPO, designed for vision-based LLMs, including the LLaVA models and the LLaMA-3.2-vi… ☆104 Updated 7 months ago
- [EMNLP 2024] mDPO: Conditional Preference Optimization for Multimodal Large Language Models. ☆73 Updated 6 months ago
- Official codebase for "GenPRM: Scaling Test-Time Compute of Process Reward Models via Generative Reasoning". ☆72 Updated 3 weeks ago
- [ICML'2024] Can AI Assistants Know What They Don't Know? ☆80 Updated last year
- mPLUG-HalOwl: Multimodal Hallucination Evaluation and Mitigating ☆94 Updated last year
- A benchmark for evaluating the capabilities of large vision-language models (LVLMs) ☆46 Updated last year
- An Easy-to-use, Scalable and High-performance RLHF Framework designed for Multimodal Models. ☆121 Updated last month
- [NeurIPS 2024] The official implementation of paper: Chain of Preference Optimization: Improving Chain-of-Thought Reasoning in LLMs. ☆119 Updated last month
- ☆43 Updated last month
- The code repository for "Wings: Learning Multimodal LLMs without Text-only Forgetting" [NeurIPS 2024] ☆18 Updated 4 months ago
- The official code repository for PRMBench. ☆73 Updated 3 months ago
- ☆13 Updated 5 months ago
- ☆73 Updated 11 months ago
- [ICML 2025] Official implementation of paper 'Look Twice Before You Answer: Memory-Space Visual Retracing for Hallucination Mitigation in… ☆53 Updated this week
- Official Repository of MMLONGBENCH-DOC: Benchmarking Long-context Document Understanding with Visualizations ☆80 Updated 10 months ago
- The code and data of DPA-RAG, accepted by WWW 2025 main conference. ☆61 Updated 3 months ago
- [ACL 2024] Logical Closed Loop: Uncovering Object Hallucinations in Large Vision-Language Models. Detect and mitigate object hallucinatio… ☆20 Updated 3 months ago
- [ICLR2025 Oral] ChartMoE: Mixture of Diversely Aligned Expert Connector for Chart Understanding ☆78 Updated last month
- Code and data for "Timo: Towards Better Temporal Reasoning for Language Models" (COLM 2024) ☆20 Updated 6 months ago
- Repository for Label Words are Anchors: An Information Flow Perspective for Understanding In-Context Learning ☆162 Updated last year