[CVPR24 Highlights] Polos: Multimodal Metric Learning from Human Feedback for Image Captioning
☆34May 25, 2025Updated last year
Alternatives and similar repositories for Polos
Users that are interested in Polos are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.
Sorting:
- Imagen-mini for girl image generation☆12Nov 19, 2022Updated 3 years ago
- [ACM MM 2024] See or Guess: Counterfactually Regularized Image Captioning☆16Feb 17, 2025Updated last year
- [ACL2023, Findings] Source codes for the paper "Werewolf Among Us: Multimodal Resources for Modeling Persuasion Behaviors in Social Deduc…☆16Feb 22, 2025Updated last year
- This is the official implementation of the paper "MM-SHAP: A Performance-agnostic Metric for Measuring Multimodal Contributions in Vision…☆32Mar 12, 2024Updated 2 years ago
- Data release for the ImageInWords (IIW) paper.☆225Nov 17, 2024Updated last year
- Wordpress hosting with auto-scaling - Free Trial Offer • AdFully Managed hosting for WordPress and WooCommerce businesses that need reliable, auto-scalable performance. Cloudways SafeUpdates now available.
- [ECCV24] Layer-Wise Relevance Propagation with Conservation Property for ResNet☆15Sep 20, 2024Updated last year
- This repository contains the data and code of the paper titled "IllusionVQA: A Challenging Optical Illusion Dataset for Vision Language M…☆24Apr 27, 2025Updated last year
- Code Repository for CausalDiffAE (ECAI 2024)☆22Oct 19, 2024Updated last year
- This code provides a PyTorch implementation for OTTER (Optimal Transport distillation for Efficient zero-shot Recognition), as described …☆71Dec 20, 2021Updated 4 years ago
- ☆27Feb 3, 2023Updated 3 years ago
- Related papers about Referring Image Segmentation (RIS)☆16Dec 26, 2023Updated 2 years ago
- Filtering, Distillation, and Hard Negatives for Vision-Language Pre-Training☆141Dec 16, 2025Updated 5 months ago
- ☆65Feb 5, 2024Updated 2 years ago
- ☆47Aug 26, 2025Updated 9 months ago
- Deploy on Railway without the complexity - Free Credits Offer • AdConnect your repo and Railway handles the rest with instant previews. Quickly provision container image services, databases, and storage volumes.
- Code for the paper: "No Zero-Shot Without Exponential Data: Pretraining Concept Frequency Determines Multimodal Model Performance" [NeurI…☆94Apr 29, 2024Updated 2 years ago
- Experiments and data for the paper "When and why vision-language models behave like bags-of-words, and what to do about it?" Oral @ ICLR …☆296Jun 7, 2023Updated 3 years ago
- Code and data for ImageCoDe, a contextual vison-and-language benchmark☆41Mar 1, 2024Updated 2 years ago
- The first unofficial implementation of CLIP4Caption: CLIP for Video Caption (ACMMM 2021)☆16Jan 2, 2023Updated 3 years ago
- ☆193Oct 28, 2024Updated last year
- Repo from the "Learning with limited labeled data" seminar @ Uni of Tuebingen. A collection of notes, notebooks and slideshows to underst…☆17Apr 13, 2023Updated 3 years ago
- NegCLIP.☆41Feb 6, 2023Updated 3 years ago
- (NeurIPS 2021) Pytorch implementation of paper "Re-ranking for image retrieval and transductive few-shot classification"☆31Nov 21, 2021Updated 4 years ago
- [ICME 2019] Source code and datasets for "Semi-supervised Compatibility Learning Across Categories for Clothing Matching"☆11Apr 26, 2024Updated 2 years ago
- Deploy on Railway without the complexity - Free Credits Offer • AdConnect your repo and Railway handles the rest with instant previews. Quickly provision container image services, databases, and storage volumes.
- final-project-level3-nlp-02 created by GitHub Classroom☆11Dec 31, 2021Updated 4 years ago
- Provably (and non-vacuously) bounding test error of deep neural networks under distribution shift with unlabeled test data.☆10Feb 27, 2024Updated 2 years ago
- PyTorch implementation of "PatchVAE: Learning Local Latent Codes for Recognition" to appear in CVPR 2020☆14Apr 9, 2020Updated 6 years ago
- Code for DVD A Diagnostic Dataset for Multi-step Reasoning in Video Grounded Dialogue☆14Oct 12, 2021Updated 4 years ago
- Touchstone: Evaluating Vision-Language Models by Language Models☆83Jan 18, 2024Updated 2 years ago
- M-HalDetect Dataset Release☆29Nov 4, 2023Updated 2 years ago
- ☆15May 13, 2024Updated 2 years ago
- [TPAMI 2023] Object Affinity Learning: Towards Annotation-free Instance Segmentation☆14Sep 14, 2023Updated 2 years ago
- A list of papers and other resources on language-guided image editing.☆39Jan 13, 2021Updated 5 years ago
- Deploy open-source AI quickly and easily - Special Bonus Offer • AdRunpod Hub is built for open source. One-click deployment and autoscaling endpoints without provisioning your own infrastructure.
- The source codes for Region Comparison Network for Interpretable Few-shot Image Classification☆10Sep 17, 2020Updated 5 years ago
- (ACL'2023) MultiCapCLIP: Auto-Encoding Prompts for Zero-Shot Multilingual Visual Captioning☆36Aug 8, 2024Updated last year
- Official Source code of "One-Shot Adaptation of GAN in Just One CLIP" IEEE Transactions on Pattern Anaylsis and Machine Intelligence (TPA…☆66Jun 5, 2023Updated 3 years ago
- ☆11Oct 9, 2022Updated 3 years ago
- ☆11Jan 19, 2025Updated last year
- A currency rate converter App.☆15Sep 5, 2019Updated 6 years ago
- An Empirical Study of GPT-3 for Few-Shot Knowledge-Based VQA, AAAI 2022 (Oral)☆87Apr 10, 2022Updated 4 years ago