An Arena-style Automated Evaluation Benchmark for Detailed Captioning
☆57Jun 1, 2025Updated 9 months ago
Alternatives and similar repositories for CapArena
Users that are interested in CapArena are comparing it to the libraries listed below
- Code for "From Ideal to Real: Unified and Data-Efficient Dense Prediction for Real-World Scenarios"☆28Jul 7, 2025Updated 7 months ago
- Official Repo for "Why Settle for One? Text-to-ImageSet Generation and Evaluation"☆21Oct 1, 2025Updated 5 months ago
- Code for Research Project TLDR☆25Jul 28, 2025Updated 7 months ago
- Official Code for "Painting with Words: Elevating Detailed Image Captioning with Benchmark and Alignment Learning" (ICLR 2025)☆12Mar 6, 2025Updated 11 months ago
- ☆24Jun 13, 2023Updated 2 years ago
- ☆39Jan 23, 2024Updated 2 years ago
- [ACL 2025] A Generalizable and Purely Unsupervised Self-Training Framework☆71Jun 1, 2025Updated 9 months ago
- Repo for Anonymous purpose, pls don't distribute☆10Oct 2, 2024Updated last year
- ☆47Oct 2, 2025Updated 5 months ago
- A collection of highlight messages from the LUG@NJU group chat, just for fun.☆12Jun 22, 2022Updated 3 years ago
- OpenMMLab Detection Toolbox and Benchmark for V3Det☆15Apr 3, 2024Updated last year
- ☆17Nov 3, 2025Updated 4 months ago
- Implementation and checkpoints of Imagen, Google's text-to-image synthesis neural network, in Pytorch☆17Dec 22, 2022Updated 3 years ago
- [ICLR 2026] This is an early exploration to introduce Interleaving Reasoning to Text-to-image Generation field and achieve the SoTA bench…☆87Jan 26, 2026Updated last month
- CLAIR: A (surprisingly) simple semantic text metric with large language models.☆22Jan 28, 2024Updated 2 years ago
- The official repository of "So-Fake: Benchmarking and Explaining Social Media Image Forgery Detection"☆28Oct 29, 2025Updated 4 months ago
- OneFlow Serving☆21Apr 10, 2025Updated 10 months ago
- [IJCAI 2025] Image Captioning Evaluation in the Age of Multimodal LLMs: Challenges and Future Perspectives☆29Nov 25, 2025Updated 3 months ago
- ☆111Jan 8, 2025Updated last year
- [ACL 2025] An inference-time decoding strategy with adaptive foresight sampling☆108May 18, 2025Updated 9 months ago
- Code release for Ming-UniVision: Joint Image Understanding and Generation with a Continuous Unified Tokenizer☆136Oct 14, 2025Updated 4 months ago
- [ACL 2025] Code and data for OS-Genesis: Automating GUI Agent Trajectory Construction via Reverse Task Synthesis☆180Oct 8, 2025Updated 4 months ago
- Official repository for the paper "MG-LLaVA: Towards Multi-Granularity Visual Instruction Tuning" (https://arxiv.org/abs/2406.17770).☆159Sep 27, 2024Updated last year
- [ECCV-2022] The First Unified End-to-End System for Panoptic Part Segmentation☆63Sep 2, 2024Updated last year
- [2025-TMLR] A Survey on the Honesty of Large Language Models☆64Dec 8, 2024Updated last year
- Official repository of PhysMaster: Mastering Physical Representation for Video Generation via Reinforcement Learning☆57Oct 16, 2025Updated 4 months ago
- Neural Code Intelligence Survey 2024-25; Reading lists and resources☆281Jul 24, 2025Updated 7 months ago
- [ICLR 2026] JanusCoder: Towards a Foundational Visual-Programmatic Interface for Code Intelligence☆76Feb 9, 2026Updated 3 weeks ago
- PhyX: Does Your Model Have the "Wits" for Physical Reasoning?☆51Feb 15, 2026Updated 2 weeks ago
- ☆36Feb 6, 2026Updated 3 weeks ago
- [EMNLP'24] LongHeads: Multi-Head Attention is Secretly a Long Context Processor☆31Apr 8, 2024Updated last year
- The repository of the project "Fine-tuning Large Language Models with Sequential Instructions", code base comes from open-instruct and LA…☆30Nov 24, 2024Updated last year
- Code and dataset link for "DenseWorld-1M: Towards Detailed Dense Grounded Caption in the Real World"☆123Oct 2, 2025Updated 5 months ago
- Middleware for LLMs: Tools Are Instrumental for Language Agents in Complex Environments (EMNLP'2024)☆37Dec 29, 2024Updated last year
- A high-throughput and memory-efficient inference and serving engine for LLMs☆12Nov 14, 2025Updated 3 months ago
- Vision-Language Models Toolbox: Your all-in-one solution for multimodal research and experimentation☆12Feb 16, 2025Updated last year
- A more efficient yolov5 with oneflow backend 🎉🎉🎉☆217Jul 10, 2025Updated 7 months ago
- Unofficial implementation of "Pix2seq: A Language Modeling Framework for Object Detection" on mmdetection☆33Apr 18, 2022Updated 3 years ago
- A highlight tool for reading ArXiv papers☆31May 30, 2021Updated 4 years ago