NVlabs/PALAVRA

☆54

Alternatives and similar repositories for PALAVRA

Users who are interested in PALAVRA are comparing it to the repositories listed below.

NVlabs / PerVLBenchmark
☆11 · Jul 31, 2022 · Updated 3 years ago
mlfoundations / clip_quality_not_quantity
☆28 · Oct 18, 2022 · Updated 3 years ago
ABaldrati / CLIP4CirDemo
[CVPR 2022 - Demo Track] - Effective conditioned and composed image retrieval combining CLIP-based features
☆86 · Nov 12, 2024 · Updated last year
miccunifi / SEARLE
[ICCV 2023] - Zero-shot Composed Image Retrieval with Textual Inversion
☆198 · Jul 31, 2025 · Updated 11 months ago
arijitray1993 / COLA
COLA: Evaluate how well your vision-language model can Compose Objects Localized with Attributes!
☆25 · May 14, 2026 · Updated 2 months ago
google-research / composed_image_retrieval
☆197 · May 9, 2026 · Updated 2 months ago
SMILE-data / SMILE
SMILE: A Multimodal Dataset for Understanding Laughter
☆13 · Jun 15, 2023 · Updated 3 years ago
jaeseokbyun / GRIT-VLP
This is an official implementation of GRIT-VLP
☆20 · Aug 8, 2022 · Updated 3 years ago
jiyounglee-0523 / FourierDecoder
Official repository for Fourier model that can generate periodic signals
☆10 · Mar 10, 2022 · Updated 4 years ago
danielchyeh / this-is-my
Official This-Is-My Dataset published in CVPR 2023
☆16 · Jul 18, 2024 · Updated 2 years ago
ilkerkesen / ViLMA
ViLMA: A Zero-Shot Benchmark for Linguistic and Temporal Grounding in Video-Language Models (ICLR 2024, Official Implementation)
☆16 · Jan 18, 2024 · Updated 2 years ago
facebookresearch / reliable_vqa
Implementation for the paper "Reliable Visual Question Answering Abstain Rather Than Answer Incorrectly" (ECCV 2022: https//arxiv.org/abs…
☆41 · May 19, 2023 · Updated 3 years ago
dmoltisanti / air-cvpr23
This repository contains the Adverbs in Recipes (AIR) dataset and the code published at the CVPR 23 paper: "Learning Action Changes by Me…
☆13 · May 25, 2023 · Updated 3 years ago
ylsung / VL_adapter
PyTorch code for "VL-Adapter: Parameter-Efficient Transfer Learning for Vision-and-Language Tasks" (CVPR2022)
☆212 · Dec 18, 2022 · Updated 3 years ago
OmkarThawakar / composed-video-retrieval
Composed Video Retrieval
☆62 · May 2, 2024 · Updated 2 years ago
tgisaturday / dalle-lightning
Refactoring dalle-pytorch and taming-transformers for TPU VM
☆60 · Aug 30, 2021 · Updated 4 years ago
miccunifi / CIRCO
[ICCV 2023] - Composed Image Retrieval on Common Objects in context (CIRCO) dataset
☆87 · Aug 6, 2025 · Updated 11 months ago
navervision / KELIP
Official PyTorch implementation of "Large-scale Bilingual Language-Image Contrastive Learning" (ICLRW 2022)
☆96 · Apr 13, 2022 · Updated 4 years ago
navervision / lincir
Official Pytorch implementation of LinCIR: Language-only Training of Zero-shot Composed Image Retrieval (CVPR 2024)
☆148 · Jan 5, 2026 · Updated 6 months ago
MikeWangWZHL / Paxion
Repo for paper: "Paxion: Patching Action Knowledge in Video-Language Foundation Models" Neurips 23 Spotlight
☆38 · May 23, 2023 · Updated 3 years ago
UCSB-NLP-Chang / DiffusionDisentanglement
Official implementation of the paper "Uncovering the Disentanglement Capability in Text-to-Image Diffusion Models"
☆175 · Oct 8, 2023 · Updated 2 years ago
google-research-datasets / videoCC-data
VideoCC is a dataset containing (video-URL, caption) pairs for training video-text machine learning models. It is created using an automa…
☆78 · Dec 5, 2022 · Updated 3 years ago
SivanDoveh / TSVLC
Repository for the paper: Teaching Structured Vision & Language Concepts to Vision & Language Models
☆47 · Sep 25, 2023 · Updated 2 years ago
LooperXX / ManagerTower
Code for ACL 2023 Oral Paper: ManagerTower: Aggregating the Insights of Uni-Modal Experts for Vision-Language Representation Learning
☆12 · Aug 23, 2025 · Updated 10 months ago
McGill-NLP / imagecode
Code and data for ImageCoDe, a contextual vision-and-language benchmark
☆42 · Mar 1, 2024 · Updated 2 years ago
mertyg / vision-language-models-are-bows
Experiments and data for the paper "When and why vision-language models behave like bags-of-words, and what to do about it?" Oral @ ICLR …
☆294 · Jun 7, 2023 · Updated 3 years ago
ABaldrati / CLIP4Cir
[ACM TOMM 2023] - Composed Image Retrieval using Contrastive Learning and Task-oriented CLIP-based Features
☆195 · Sep 5, 2023 · Updated 2 years ago
adham-elarabawy / direct-inversion
Official code implementation for our paper -- Direct Inversion: Optimization-Free Text-Driven Real Image Editing with Diffusion Models.
☆27 · Nov 18, 2022 · Updated 3 years ago
kdariina / CLIP-not-BoW-unimodally
Code for "CLIP Behaves like a Bag-of-Words Model Cross-modally but not Uni-modally"
☆29 · Feb 27, 2026 · Updated 4 months ago
RAIVNLab / CREPE
[CVPR23 Highlight] CREPE: Can Vision-Language Foundation Models Reason Compositionally?
☆35 · Apr 27, 2023 · Updated 3 years ago
shuheikurita / RefEgo
☆13 · Jul 20, 2024 · Updated 2 years ago
yuhui-zh15 / drml
Official Code Release for "Diagnosing and Rectifying Vision Models using Language" (ICLR 2023)
☆34 · Jun 8, 2023 · Updated 3 years ago
ml-jku / semantic-image-text-alignment
☆25 · Jul 10, 2023 · Updated 3 years ago
florianHofherr / PhysParamInference
☆19 · Jan 30, 2023 · Updated 3 years ago
princetonvisualai / imagecaptioning-bias
Code for the paper "Understanding and Evaluating Racial Biases in Image Captioning"
☆12 · Mar 26, 2026 · Updated 3 months ago
multimodal-art-projection / IV-Bench
☆14 · Apr 23, 2025 · Updated last year
ddehun / DEnsity
Official repository for "DEnsity: Open-domain Dialogue Evaluation Metric using Density Estimation (ACL2023 Findings)"
☆11 · May 23, 2023 · Updated 3 years ago
bpiyush / TestOfTime
Official code for our CVPR 2023 paper: Test of Time: Instilling Video-Language Models with a Sense of Time
☆46 · Jun 11, 2024 · Updated 2 years ago
eisneim / clip-vip_video_search
showing how to use CLIP-Vip to do video search
☆16 · Nov 16, 2023 · Updated 2 years ago