[CVPR24 Highlights] Polos: Multimodal Metric Learning from Human Feedback for Image Captioning
☆33May 25, 2025Updated 10 months ago
Alternatives and similar repositories for Polos
Users who are interested in Polos are comparing it to the libraries listed below.
- Imagen-mini for girl image generation☆12Nov 19, 2022Updated 3 years ago
- [ACL2023, Findings] Source codes for the paper "Werewolf Among Us: Multimodal Resources for Modeling Persuasion Behaviors in Social Deduc…☆16Feb 22, 2025Updated last year
- [NeurIPS 2023] A faithful benchmark for vision-language compositionality☆89Feb 13, 2024Updated 2 years ago
- Data release for the ImageInWords (IIW) paper.☆227Nov 17, 2024Updated last year
- Densely Captioned Images (DCI) dataset repository.☆198Jul 1, 2024Updated last year
- [ECCV24] Layer-Wise Relevance Propagation with Conservation Property for ResNet☆15Sep 20, 2024Updated last year
- LLaVA-JP is a Japanese VLM trained by LLaVA method☆64Jul 3, 2024Updated last year
- PyTorch implementation of "PatchGame: Learning to Signal Mid-level Patches in Referential Games" to appear in NeurIPS 2021☆24Jun 4, 2021Updated 4 years ago
- ☆18Sep 13, 2023Updated 2 years ago
- This code provides a PyTorch implementation for OTTER (Optimal Transport distillation for Efficient zero-shot Recognition), as described …☆71Dec 20, 2021Updated 4 years ago
- ☆26Feb 3, 2023Updated 3 years ago
- Filtering, Distillation, and Hard Negatives for Vision-Language Pre-Training☆141Dec 16, 2025Updated 3 months ago
- ☆65Feb 5, 2024Updated 2 years ago
- Official implementation of TagAlign☆37Dec 11, 2024Updated last year
- Experiments and data for the paper "When and why vision-language models behave like bags-of-words, and what to do about it?" Oral @ ICLR …☆292Jun 7, 2023Updated 2 years ago
- Code and data for ImageCoDe, a contextual vision-and-language benchmark☆41Mar 1, 2024Updated 2 years ago
- Repo from the "Learning with limited labeled data" seminar @ Uni of Tuebingen. A collection of notes, notebooks and slideshows to underst…☆17Apr 13, 2023Updated 2 years ago
- [ICME 2019] Source code and datasets for "Semi-supervised Compatibility Learning Across Categories for Clothing Matching"☆10Apr 26, 2024Updated last year
- final-project-level3-nlp-02 created by GitHub Classroom☆11Dec 31, 2021Updated 4 years ago
- ☆17Nov 4, 2022Updated 3 years ago
- Provably (and non-vacuously) bounding test error of deep neural networks under distribution shift with unlabeled test data.☆10Feb 27, 2024Updated 2 years ago
- ☆29Sep 12, 2022Updated 3 years ago
- Code for DVD: A Diagnostic Dataset for Multi-step Reasoning in Video Grounded Dialogue☆14Oct 12, 2021Updated 4 years ago
- Touchstone: Evaluating Vision-Language Models by Language Models☆83Jan 18, 2024Updated 2 years ago
- [ACL 2023 Findings] FACTUAL dataset, the textual scene graph parser trained on FACTUAL.☆127Mar 23, 2026Updated last week
- M-HalDetect Dataset Release☆28Nov 4, 2023Updated 2 years ago
- ☆15May 13, 2024Updated last year
- [TPAMI 2023] Object Affinity Learning: Towards Annotation-free Instance Segmentation☆14Sep 14, 2023Updated 2 years ago
- A list of papers and other resources on language-guided image editing.☆39Jan 13, 2021Updated 5 years ago
- The source codes for Region Comparison Network for Interpretable Few-shot Image Classification☆10Sep 17, 2020Updated 5 years ago
- (ACL'2023) MultiCapCLIP: Auto-Encoding Prompts for Zero-Shot Multilingual Visual Captioning☆36Aug 8, 2024Updated last year
- [ICCV 2025 Highlight] LMM4LMM: Benchmarking and Evaluating Large-multimodal Image Generation with LMMs☆20Nov 16, 2025Updated 4 months ago
- [EMNLP 24] Official Implementation of CLEANGEN: Mitigating Backdoor Attacks for Generation Tasks in Large Language Models☆19Mar 9, 2025Updated last year
- Official Source code of "One-Shot Adaptation of GAN in Just One CLIP" IEEE Transactions on Pattern Analysis and Machine Intelligence (TPA…☆66Jun 5, 2023Updated 2 years ago
- ☆11Oct 9, 2022Updated 3 years ago
- An Empirical Study of GPT-3 for Few-Shot Knowledge-Based VQA, AAAI 2022 (Oral)☆87Apr 10, 2022Updated 3 years ago
- ☆11Jan 19, 2025Updated last year
- ☆11Jan 16, 2024Updated 2 years ago
- Web2Code: A Large-scale Webpage-to-Code Dataset and Evaluation Framework for Multimodal LLMs☆102Oct 23, 2024Updated last year