michelecafagna26 / cider
Pythonic wrappers for Cider/CiderD evaluation metrics. Provides CIDEr as well as CIDEr-D (CIDEr Defended), which is more robust to gaming effects. We also add the option to replace the original PTBTokenizer with the spaCy tokenizer (no Java dependency, but slower).
☆12Updated last year
Alternatives and similar repositories for cider:
Users who are interested in cider are comparing it to the repositories listed below
- Repo for paper "CODIS: Benchmarking Context-Dependent Visual Comprehension for Multimodal Large Language Models".☆10Updated 3 months ago
- ☆15Updated 2 months ago
- Evaluation framework for paper "VisualWebBench: How Far Have Multimodal LLMs Evolved in Web Page Understanding and Grounding?"☆47Updated 3 months ago
- This repo contains the code and data for "MEGA-Bench: Scaling Multimodal Evaluation to over 500 Real-World Tasks"☆51Updated this week
- Code for Math-LLaVA: Bootstrapping Mathematical Reasoning for Multimodal Large Language Models☆76Updated 6 months ago
- Official code repository for Interleaved Scene Graph.☆13Updated last month
- Official Pytorch implementation of "Interpreting and Editing Vision-Language Representations to Mitigate Hallucinations"☆51Updated 3 weeks ago
- ☆18Updated 8 months ago
- Video-STaR: Self-Training Enables Video Instruction Tuning with Any Supervision☆58Updated 6 months ago
- A hot-pluggable tool for visualizing LLaVA's attention.☆13Updated 11 months ago
- ☆134Updated 2 months ago
- Large Language Models Can Self-Improve in Long-context Reasoning☆61Updated last month
- This repo contains evaluation code for the paper "MileBench: Benchmarking MLLMs in Long Context"☆28Updated 6 months ago
- Visualizing the attention of vision-language models☆97Updated 2 months ago
- [NeurIPS 2024] Code for the paper "Diffusion of Thoughts: Chain-of-Thought Reasoning in Diffusion Language Models"☆94Updated 10 months ago
- Official code for paper "UniIR: Training and Benchmarking Universal Multimodal Information Retrievers" (ECCV 2024)☆120Updated 3 months ago
- ☆59Updated 7 months ago
- ☆19Updated 2 months ago
- ☆38Updated last year
- [NAACL 2024] Vision language model that reduces hallucinations through self-feedback guided revision. Visualizes attentions on image feat…☆43Updated 5 months ago
- ☆39Updated 2 months ago
- [NeurIPS 2024] Calibrated Self-Rewarding Vision Language Models☆61Updated 7 months ago
- LLMScore: Unveiling the Power of Large Language Models in Text-to-Image Synthesis Evaluation☆126Updated last year
- MAmmoTH-VL: Eliciting Multimodal Reasoning with Instruction Tuning at Scale☆29Updated last month
- ☆94Updated last year
- A benchmark for evaluating the capabilities of large vision-language models (LVLMs)☆42Updated last year
- ☆42Updated 5 months ago
- Code for the paper "AutoPresent: Designing Structured Visuals From Scratch"☆41Updated last week
- Code for paper "Unraveling Cross-Modality Knowledge Conflicts in Large Vision-Language Models."☆37Updated 3 months ago
- This repo contains evaluation code for the paper "BLINK: Multimodal Large Language Models Can See but Not Perceive". https://arxiv.or…☆112Updated 6 months ago