njucckevin/CapArena

njucckevin / CapArena

An Arena-style Automated Evaluation Benchmark for Detailed Captioning

☆59

Alternatives and similar repositories for CapArena

Users who are interested in CapArena are comparing it to the libraries listed below.

njucckevin / OpenMobile-Code
The model, data and code for OpenMobile
☆50 · Jul 9, 2026 · Updated 2 weeks ago
xufangzhi / Odyssey-Arena
Extremely Long-Horizon Agentic Tasks Requiring Active Acting and Inductive Reasoning
☆33 · Feb 9, 2026 · Updated 5 months ago
xcltql666 / DenseDiT
Code for "From Ideal to Real: Unified and Data-Efficient Dense Prediction for Real-World Scenarios"
☆27 · Jun 7, 2026 · Updated last month
Liac-li / MM-self-improve-qwen2vl
☆13 · Dec 9, 2024 · Updated last year
chengyou-jia / T2IS
Official Repo for "Why Settle for One? Text-to-ImageSet Generation and Evaluation"
☆21 · Oct 1, 2025 · Updated 9 months ago
yayayacc / TIDE
☆18 · Feb 4, 2026 · Updated 5 months ago
InternLM / JanusCoder
[ICLR 2026] JanusCoder: Towards a Foundational Visual-Programmatic Interface for Code Intelligence
☆78 · May 9, 2026 · Updated 2 months ago
wjn1996 / Chain-of-Knowledge
☆24 · Jun 13, 2023 · Updated 3 years ago
yayayacc / MUR
☆49 · May 14, 2026 · Updated 2 months ago
MAGAer13 / DeCapBench
Official Code for "Painting with Words: Elevating Detailed Image Captioning with Benchmark and Alignment Learning" (ICLR 2025)
☆14 · Mar 6, 2025 · Updated last year
chengyou-jia / AgentStore
[ACL 2025] AgentStore: Scalable Integration of Heterogeneous Agents As Specialized Generalist Computer Assistant
☆46 · Dec 19, 2024 · Updated last year
OS-Copilot / ScienceBoard
[ICLR 2026] Code, benchmark and environment for "ScienceBoard: Evaluating Multimodal Autonomous Agents in Realistic Scientific Workflows"
☆132 · Feb 2, 2026 · Updated 5 months ago
njucckevin / MM-Self-Improve
A Self-Training Framework for Vision-Language Reasoning
☆90 · Jan 23, 2025 · Updated last year
xufangzhi / Genius
[ACL 2025] A Generalizable and Purely Unsupervised Self-Training Framework
☆72 · Jun 1, 2025 · Updated last year
hkust-nlp / GUIMid
☆22 · May 3, 2025 · Updated last year
X-GenGroup / PaCo-RL
Official Implementation for *PaCo-RL: Advancing Reinforcement Learning for Consistent Image Generation with Pairwise Reward Modeling*
☆42 · Dec 13, 2025 · Updated 7 months ago
ali-vilab / CAPability
What Is a Good Caption? A Comprehensive Visual Caption Benchmark for Evaluating Both Correctness and Thoroughness
☆28 · May 16, 2025 · Updated last year
chengyou-jia / ChatGen
[CVPR 2025] ChatGen: Automatic Text-to-Image Generation From FreeStyle Chatting
☆33 · Dec 5, 2024 · Updated last year
MuyeHuang / EvoChart
☆19 · Nov 3, 2025 · Updated 8 months ago
njucckevin / KnowCap
Code for Beyond Generic: Enhancing Image Captioning with Real-World Knowledge using Vision-Language Pre-Training Model
☆13 · Feb 15, 2024 · Updated 2 years ago
OS-Copilot / OS-Symphony
[ACL 2026 Main] Official repository for paper: OS-Symphony: A Holistic Framework for Robust and Generalist Computer-Using Agents
☆48 · Apr 7, 2026 · Updated 3 months ago
CoopReason / TESSY
A Teacher–Student Cooperation Framework to Synthesize Student-Consistent SFT Data
☆34 · May 1, 2026 · Updated 2 months ago
cheliu-computation / AlphaMed-NeurIPSW
Unleashing Reasoning in Medical Large Language Models
☆12 · Mar 19, 2025 · Updated last year
dxzxy12138 / PhysReason
PhysReason Benchmark
☆19 · Jul 8, 2025 · Updated last year
CONE-MT / LLaMAX
☆75 · Dec 6, 2024 · Updated last year
xufangzhi / Symbol-LLM
[ACL 2024] The project of Symbol-LLM
☆59 · Jul 10, 2024 · Updated 2 years ago
njucckevin / SeeClick
The model, data and code for the visual GUI Agent SeeClick
☆493 · Jul 13, 2025 · Updated last year
OS-Copilot / OS-Genesis
[ACL 2025] Code and data for OS-Genesis: Automating GUI Agent Trajectory Construction via Reverse Task Synthesis
☆188 · Oct 8, 2025 · Updated 9 months ago
jonathan-roberts1 / SciFIBench
NeurIPS 2024: SciFIBench: Benchmarking Large Multimodal Models for Scientific Figure Interpretation
☆13 · May 24, 2025 · Updated last year
wzz618 / wozaixiaoyuan
Various APIs for the 我在校园 (Wozaixiaoyuan) app, with auto-run scripts and multi-user support
☆12 · Jun 28, 2022 · Updated 4 years ago
LuLuLuyi / LongHeads
[EMNLP'24] LongHeads: Multi-Head Attention is Secretly a Long Context Processor
☆32 · Apr 8, 2024 · Updated 2 years ago
xufangzhi / ENVISIONS
[ACL 2025] A Neural-Symbolic Self-Training Framework
☆117 · Jun 1, 2025 · Updated last year
xufangzhi / phi-Decoding
[ACL 2025] An inference-time decoding strategy with adaptive foresight sampling
☆107 · May 18, 2025 · Updated last year
Mark-Sky / KCL
Implementation of 'The Devil is in the Few Shots: Iterative Visual Knowledge Completion for Few-shot Learning'
☆13 · Nov 22, 2024 · Updated last year
jesse-michael-han / neuro-cadical
CaDiCaL + neural glue variable predictions
☆10 · Oct 21, 2020 · Updated 5 years ago
aburns4 / textualforesight
☆12 · Aug 8, 2024 · Updated last year
starreeze / efuf
the official repo for EMNLP 2024 (main) paper "EFUF: Efficient Fine-grained Unlearning Framework for Mitigating Hallucinations in Multimo…
☆21 · Apr 9, 2025 · Updated last year
Kitware / generated-image-detection
☆16 · Mar 6, 2024 · Updated 2 years ago
francescortu / comp-mech
Competition of Mechanisms: Tracing How Language Models Handle Facts and Counterfactuals; ACL 2024
☆13 · May 24, 2024 · Updated 2 years ago