YuigaWada/Polos

[CVPR24 Highlights] Polos: Multimodal Metric Learning from Human Feedback for Image Captioning

☆33

Alternatives and similar repositories for Polos

Users who are interested in Polos are comparing it to the repositories listed below.

Yebin46 / FLEUR
[ACL 2024] FLEUR: An Explainable Reference-Free Evaluation Metric for Image Captioning Using a Large Multimodal Model
☆17 · Apr 28, 2025 · Updated last year
alfredplpl / imagen-mini-girl
Imagen-mini for girl image generation
☆12 · Nov 19, 2022 · Updated 3 years ago
Aman-4-Real / See-or-Guess
[ACM MM 2024] See or Guess: Counterfactually Regularized Image Captioning
☆16 · Feb 17, 2025 · Updated last year
k1000dai / AlohaScorpion
☆17 · Feb 18, 2026 · Updated 5 months ago
SALT-NLP / PersuationGames
[ACL2023, Findings] Source codes for the paper "Werewolf Among Us: Multimodal Resources for Modeling Persuasion Behaviors in Social Deduc…
☆16 · Feb 22, 2025 · Updated last year
Heidelberg-NLP / MM-SHAP
This is the official implementation of the paper "MM-SHAP: A Performance-agnostic Metric for Measuring Multimodal Contributions in Vision…
☆32 · Jul 14, 2026 · Updated 2 weeks ago
RAIVNLab / sugar-crepe
[NeurIPS 2023] A faithful benchmark for vision-language compositionality
☆94 · Feb 13, 2024 · Updated 2 years ago
google / imageinwords
Data release for the ImageInWords (IIW) paper.
☆224 · Nov 17, 2024 · Updated last year
facebookresearch / DCI
Densely Captioned Images (DCI) dataset repository.
☆197 · Jul 1, 2024 · Updated 2 years ago
keio-smilab24 / LRP-for-ResNet
[ECCV24] Layer-Wise Relevance Propagation with Conservation Property for ResNet
☆15 · Sep 20, 2024 · Updated last year
CUMTGG / CIIC
☆18 · Sep 13, 2023 · Updated 2 years ago
kampta / PatchGame
PyTorch implementation of "PatchGame: Learning to Signal Mid-level Patches in Referential Games" to appear in NeurIPS 2021
☆24 · Jun 4, 2021 · Updated 5 years ago
csebuetnlp / IllusionVQA
This repository contains the data and code of the paper titled "IllusionVQA: A Challenging Optical Illusion Dataset for Vision Language M…
☆24 · Apr 27, 2025 · Updated last year
ubc-vision / IterativeSG
☆27 · Feb 3, 2023 · Updated 3 years ago
facebookresearch / OTTER
This code provides a PyTorch implementation for OTTER (Optimal Transport distillation for Efficient zero-shot Recognition), as described …
☆71 · Dec 20, 2021 · Updated 4 years ago
Huntersxsx / RIS-Learning-List
Related papers about Referring Image Segmentation (RIS)
☆16 · Dec 26, 2023 · Updated 2 years ago
facebookresearch / diht
Filtering, Distillation, and Hard Negatives for Vision-Language Pre-Training
☆141 · Dec 16, 2025 · Updated 7 months ago
Qinying-Liu / TagAlign
Official implementation of TagAlign
☆37 · Dec 11, 2024 · Updated last year
thunlp / Muffin
☆65 · Feb 5, 2024 · Updated 2 years ago
keanudicap / MSQA
Microsoft question-answering dataset
☆10 · Jun 16, 2023 · Updated 3 years ago
bethgelab / frequency_determines_performance
Code for the paper: "No Zero-Shot Without Exponential Data: Pretraining Concept Frequency Determines Multimodal Model Performance" [NeurI…
☆94 · Apr 29, 2024 · Updated 2 years ago
naver-ai / mid.metric
☆30 · Jan 3, 2023 · Updated 3 years ago
mertyg / vision-language-models-are-bows
Experiments and data for the paper "When and why vision-language models behave like bags-of-words, and what to do about it?" Oral @ ICLR …
☆294 · Jun 7, 2023 · Updated 3 years ago
dahyun-kang / cub-200-2011-part-visualizer
Visualization tool for CUB-200-2011 part keypoints (Wah et al.).
☆10 · Sep 17, 2021 · Updated 4 years ago
learn2phoenix / CSD
☆197 · Oct 28, 2024 · Updated last year
zhaohengyuan1 / Genixer
(ECCV 2024) Empowering Multimodal Large Language Model as a Powerful Data Generator
☆116 · Mar 21, 2025 · Updated last year
CRIPAC-DIG / SCGAN
[ICME 2019] Source code and datasets for "Semi-supervised Compatibility Learning Across Categories for Clothing Matching"
☆11 · Apr 26, 2024 · Updated 2 years ago
EdiBERT4ImageManipulation / EdiBERT
☆17 · Nov 4, 2022 · Updated 3 years ago
erosenfeld / disagree_discrep
Provably (and non-vacuously) bounding test error of deep neural networks under distribution shift with unlabeled test data.
☆10 · Feb 27, 2024 · Updated 2 years ago
hendryx-scale / mhal-detect
M-HalDetect Dataset Release
☆30 · Nov 4, 2023 · Updated 2 years ago
boostcampaitech2 / final-project-level3-nlp-02
final-project-level3-nlp-02 created by GitHub Classroom
☆11 · Dec 31, 2021 · Updated 4 years ago
mayu-ot / oc-cost
☆30 · Sep 12, 2022 · Updated 3 years ago
kampta / PatchVAE
PyTorch implementation of "PatchVAE: Learning Local Latent Codes for Recognition" to appear in CVPR 2020
☆14 · Apr 9, 2020 · Updated 6 years ago
Akomand / CausalDiffAE
Code Repository for CausalDiffAE (ECAI 2024)
☆26 · Oct 19, 2024 · Updated last year
facebookresearch / DVDialogues
Code for DVD A Diagnostic Dataset for Multi-step Reasoning in Video Grounded Dialogue
☆14 · Oct 12, 2021 · Updated 4 years ago
p1atdev / safemetadata
☆12 · Jul 6, 2026 · Updated 3 weeks ago
aimagelab / awesome-captioning-evaluation
[IJCAI 2025] Image Captioning Evaluation in the Age of Multimodal LLMs: Challenges and Future Perspectives
☆36 · Nov 25, 2025 · Updated 8 months ago
zhuang-li / FactualSceneGraph
[ACL 2023 Findings] FACTUAL dataset, the textual scene graph parser trained on FACTUAL.
☆131 · Jun 15, 2026 · Updated last month
Robertwyq / Object-Affinity
[TPAMI 2023] Object Affinity Learning: Towards Annotation-free Instance Segmentation
☆14 · Sep 14, 2023 · Updated 2 years ago