ml-jku/semantic-image-text-alignment

ml-jku / semantic-image-text-alignment

☆25

Alternatives and similar repositories for semantic-image-text-alignment

Users who are interested in semantic-image-text-alignment compare it to the repositories listed below.

junyangwang0410 / Knight
SotA text-only image/video method (IJCAI 2023)
☆16 · Jan 9, 2024 · Updated 2 years ago
ahmedssabir / Belief-Revision-Score
Belief Revision based Caption Re-ranker with Visual Semantic Information. COLING 2022
☆11 · Apr 13, 2025 · Updated 10 months ago
uqzhichen / Awesome-compositional-zero-shot-learning
Paper list of compositional zero-shot learning
☆11 · Jul 5, 2022 · Updated 3 years ago
tttyuntian / vlm_lexical_grounding
PyTorch code for the Findings of EMNLP 2021 paper "Does Vision-and-Language Pretraining Improve Lexical Grounding?"
☆11 · Sep 26, 2021 · Updated 4 years ago
manantomar / video-occupancy-models
☆12 · Jul 16, 2024 · Updated last year
princetonvisualai / SPICE-U
☆11 · Sep 7, 2020 · Updated 5 years ago
facebookresearch / synlm
Code for paper: "Privately generating tabular data using language models".
☆15 · Jun 13, 2023 · Updated 2 years ago
RitaRamo / extra
Retrieval-augmented Image Captioning
☆13 · Feb 16, 2023 · Updated 3 years ago
Lexsi-Labs / aligntune
Aligntune: A Modular Toolkit for Post Training Alignment of LLMs
☆35 · Feb 26, 2026 · Updated last week
iamanigeeit / present
☆14 · Aug 19, 2024 · Updated last year
KaliYuga-ai / Lithography-Diffusion
☆14 · Jul 30, 2022 · Updated 3 years ago
facebookresearch / stepdiff
Data release for Step Differences in Instructional Video (CVPR24)
☆14 · Jun 19, 2024 · Updated last year
chenchenzi / HKCantonese_models
This is a repository dedicated to pre-trained acoustic models of Hong Kong Cantonese and Cantonese forced alignment.
☆23 · Nov 14, 2024 · Updated last year
jaisidhsingh / CoN-CLIP
Implementation of the "Learn No to Say Yes Better" paper.
☆39 · Oct 30, 2025 · Updated 4 months ago
zihuixue / AlignEgoExo
Code and data release for the paper "Learning Fine-grained View-Invariant Representations from Unpaired Ego-Exo Videos via Temporal Align…
☆19 · Apr 5, 2024 · Updated last year
sunlab-osu / IterPrompt
☆19 · Nov 7, 2022 · Updated 3 years ago
sunwei925 / DN-PIQA
Dual-Branch Network for Portrait Image Quality Assessment
☆18 · Sep 16, 2025 · Updated 5 months ago
Lihr747 / CgtGAN
☆20 · May 3, 2025 · Updated 10 months ago
BrandonHanx / CompFashion
[CVPR(W) 2022] UIGR: Unified Interactive Garment Retrieval
☆22 · Dec 3, 2021 · Updated 4 years ago
facebookresearch / genecis
Code and Models for "GeneCIS: A Benchmark for General Conditional Image Similarity"
☆61 · Jun 12, 2023 · Updated 2 years ago
Code-kunkun / ZS-CIR
[BMVC 2023] Zero-shot Composed Text-Image Retrieval
☆55 · Nov 26, 2024 · Updated last year
mightyzau / RegionBLIP
☆58 · Aug 7, 2023 · Updated 2 years ago
pyladiesams / conformal-prediction-jan2024
An introduction to conformal prediction
☆27 · Jan 31, 2024 · Updated 2 years ago
facebookresearch / PostText
PostText is a QA system for querying your text data. When appropriate structured views are in place, PostText is good at answering querie…
☆31 · Jun 14, 2023 · Updated 2 years ago
ananyathomas / Vecna
Vecna is a Python chatbot that recommends songs and movies depending on your feelings
☆12 · Jun 28, 2022 · Updated 3 years ago
aimagelab / pacscore
[CVPR 2023 & IJCV 2025] Positive-Augmented Contrastive Learning for Image and Video Captioning Evaluation
☆65 · Jul 29, 2025 · Updated 7 months ago
goel-shashank / CyCLIP
☆125 · Feb 21, 2023 · Updated 3 years ago
eric-ai-lab / ComCLIP
Official implementation and dataset for the NAACL 2024 paper "ComCLIP: Training-Free Compositional Image and Text Matching"
☆37 · Aug 18, 2024 · Updated last year
Saehyung-Lee / PlugIR
Official repository of "Interactive Text-to-Image Retrieval with Large Language Models: A Plug-and-Play Approach" (ACL 2024 Oral)
☆34 · Mar 24, 2025 · Updated 11 months ago
facebookresearch / VidOSC
Code and data release for the paper "Learning Object State Changes in Videos: An Open-World Perspective" (CVPR 2024)
☆35 · Sep 9, 2024 · Updated last year
emu1729 / GIST
Generating Image Specific Text
☆29 · Aug 14, 2023 · Updated 2 years ago
Mu-Y / mpl-mdd
[Interspeech22] Improving Mispronunciation Detection with Wav2vec2-based Momentum Pseudo-Labeling for Accentedness and Intelligibility Ass…
☆34 · Jan 23, 2024 · Updated 2 years ago
ExplainableML / Vision_by_Language
[ICLR 2024] Official repository for "Vision-by-Language for Training-Free Compositional Image Retrieval"
☆84 · Jul 4, 2024 · Updated last year
gisilvs / AEF
☆33 · Mar 1, 2023 · Updated 3 years ago
sdc17 / CrossGET
[ICML 2024] CrossGET: Cross-Guided Ensemble of Tokens for Accelerating Vision-Language Transformers
☆34 · Dec 30, 2024 · Updated last year
RAIVNLab / CREPE
[CVPR23 Highlight] CREPE: Can Vision-Language Foundation Models Reason Compositionally?
☆35 · Apr 27, 2023 · Updated 2 years ago
dmarx / zero-shot-intent-classifier
Minimal zero-shot intent classifier for arbitrary intent slot filling, via LLM prompting with LangChain.
☆37 · Mar 13, 2023 · Updated 2 years ago
researchmm / soho
[CVPR'21 Oral] Seeing Out of tHe bOx: End-to-End Pre-training for Vision-Language Representation Learning
☆208 · Sep 30, 2022 · Updated 3 years ago
yangbang18 / MultiCapCLIP
(ACL'2023) MultiCapCLIP: Auto-Encoding Prompts for Zero-Shot Multilingual Visual Captioning
☆36 · Aug 8, 2024 · Updated last year