An Arena-style Automated Evaluation Benchmark for Detailed Captioning
☆59Jun 1, 2025Updated 11 months ago
Alternatives and similar repositories for CapArena
Users who are interested in CapArena are comparing it to the libraries listed below.
- Code for "From Ideal to Real: Unified and Data-Efficient Dense Prediction for Real-World Scenarios"☆27Jul 7, 2025Updated 10 months ago
- Official Repo for "Why Settle for One? Text-to-ImageSet Generation and Evaluation"☆21Oct 1, 2025Updated 7 months ago
- Code for Research Project TLDR☆25Jul 28, 2025Updated 9 months ago
- [ACL 2026] Code, benchmark and environment for "OS-Sentinel: Towards Safety-Enhanced Mobile GUI Agents via Hybrid Validation in Realistic…☆47Nov 10, 2025Updated 6 months ago
- ☆13Dec 9, 2024Updated last year
- ☆24Jun 13, 2023Updated 2 years ago
- Official Code for "Painting with Words: Elevating Detailed Image Captioning with Benchmark and Alignment Learning" (ICLR 2025)☆14Mar 6, 2025Updated last year
- [ACL 2025] AgentStore: Scalable Integration of Heterogeneous Agents As Specialized Generalist Computer Assistant☆45Dec 19, 2024Updated last year
- ☆40Jan 23, 2024Updated 2 years ago
- [ACL 2025] A Generalizable and Purely Unsupervised Self-Training Framework☆71Jun 1, 2025Updated 11 months ago
- What Is a Good Caption? A Comprehensive Visual Caption Benchmark for Evaluating Both Correctness and Thoroughness☆27May 16, 2025Updated last year
- ☆49May 14, 2026Updated last week
- ☆18Nov 3, 2025Updated 6 months ago
- Repo for Anonymous purpose, pls don't distribute☆10Oct 2, 2024Updated last year
- Code for Beyond Generic: Enhancing Image Captioning with Real-World Knowledge using Vision-Language Pre-Training Model☆13Feb 15, 2024Updated 2 years ago
- ☆75Dec 6, 2024Updated last year
- [ICLR 2026] Code, benchmark and environment for "ScienceBoard: Evaluating Multimodal Autonomous Agents in Realistic Scientific Workflows"☆128Feb 2, 2026Updated 3 months ago
- [IJCAI 2025] Image Captioning Evaluation in the Age of Multimodal LLMs: Challenges and Future Perspectives☆34Nov 25, 2025Updated 5 months ago
- NeurIPS 2024: SciFIBench: Benchmarking Large Multimodal Models for Scientific Figure Interpretation☆13May 24, 2025Updated last year
- [ACL 2025] Code and data for OS-Genesis: Automating GUI Agent Trajectory Construction via Reverse Task Synthesis☆186Oct 8, 2025Updated 7 months ago
- Neural Code Intelligence Survey 2024-25; Reading lists and resources☆282Jul 24, 2025Updated 10 months ago
- PhyX: Does Your Model Have the "Wits" for Physical Reasoning?☆52Mar 16, 2026Updated 2 months ago
- Official repository of PhysMaster: Mastering Physical Representation for Video Generation via Reinforcement Learning☆57Oct 16, 2025Updated 7 months ago
- [ACL 2025] A Neural-Symbolic Self-Training Framework☆116Jun 1, 2025Updated 11 months ago
- CaDiCaL + neural glue variable predictions☆10Oct 21, 2020Updated 5 years ago
- [ACL2023, Findings] Source codes for the paper "Werewolf Among Us: Multimodal Resources for Modeling Persuasion Behaviors in Social Deduc…☆16Feb 22, 2025Updated last year
- ☆12Aug 8, 2024Updated last year
- [2025-TMLR] A Survey on the Honesty of Large Language Models☆66Dec 8, 2024Updated last year
- the official repo for EMNLP 2024 (main) paper "EFUF: Efficient Fine-grained Unlearning Framework for Mitigating Hallucinations in Multimo…☆21Apr 9, 2025Updated last year
- ☆112Jan 8, 2025Updated last year
- OpenMMLab Detection Toolbox and Benchmark for V3Det☆15Apr 3, 2024Updated 2 years ago
- Easy control for Key-Value Constrained Generative LLM Inference (https://arxiv.org/abs/2402.06262)☆62Feb 13, 2024Updated 2 years ago
- The repository of the project "Fine-tuning Large Language Models with Sequential Instructions", code base comes from open-instruct and LA…☆30Nov 24, 2024Updated last year
- Implements DP/TP/PP using torch.distributed☆15Dec 28, 2023Updated 2 years ago
- [NeurIPS 2024 poster] Cross-model Control: Improving Multiple Large Language Models in One-time Training☆14Oct 25, 2024Updated last year
- [EMNLP 2024 Findings] SEA is an automated paper review framework capable of generating comprehensive and high-quality review feedback wit…☆90Jan 18, 2026Updated 4 months ago
- [ICASSP 2025 Oral] The official implementation of paper "TextureDiffusion: Target Prompt Disentangled Editing for Various Texture Transfe…☆16Mar 13, 2025Updated last year
- The second Homework of NLP☆13Jun 9, 2021Updated 4 years ago
- ☆36Mar 10, 2025Updated last year