dieuroi / SimAesthetics
☆10 · Updated 5 months ago
Related projects
Alternatives and complementary repositories for SimAesthetics
- (ICML 2024) Improve Context Understanding in Multimodal Large Language Models via Multimodal Composition Learning ☆20 · Updated last month
- ☆17 · Updated 3 months ago
- Context-I2W: Mapping Images to Context-dependent words for Accurate Zero-Shot Composed Image Retrieval [AAAI 2024 Oral] ☆39 · Updated 7 months ago
- Ask&Confirm: Active Detail Enriching for Cross-Modal Retrieval with Partial Query (ICCV2021) ☆20 · Updated 2 years ago
- Repo for our NeurIPS 2023 paper on: Divide, Evaluate, and Refine: Evaluating and Improving Text-to-Image Alignment with Iterative VQA Fee… ☆25 · Updated 11 months ago
- [AAAI 2022 Oral] This is a Pytorch implementation of the AAAI 2022 paper "Cross-Domain Empirical Risk Minimization for Unbiased Long-tail… ☆33 · Updated 2 years ago
- 【NeurIPS 2024】The official code of paper "Automated Multi-level Preference for MLLMs" ☆17 · Updated last month
- Benchmark data for "Rethinking Benchmarks for Cross-modal Image-text Retrieval" (SIGIR 2023) ☆22 · Updated last year
- This repo holds the Pytorch codes and models for the BTH framework presented on CVPR 2021 ☆32 · Updated 3 years ago
- Repository for the paper: Teaching Structured Vision & Language Concepts to Vision & Language Models ☆44 · Updated last year
- (NeurIPS 2024 Spotlight) TOPA: Extend Large Language Models for Video Understanding via Text-Only Pre-Alignment ☆17 · Updated last month
- [ECCV2022] The PyTorch implementation of paper "Equivariance and Invariance Inductive Bias for Learning from Insufficient Data" ☆18 · Updated 2 years ago
- Official code of "Discover the Unknown Biased Attribute of an Image Classifier" (ICCV 2021) ☆19 · Updated 3 years ago
- 【ICLR 2024, Spotlight】Sentence-level Prompts Benefit Composed Image Retrieval ☆67 · Updated 6 months ago
- A reading list of papers about Visual Grounding. ☆31 · Updated 2 years ago
- Official implementation of the Composed Image Retrieval using Pretrained LANguage Transformers (CIRPLANT) | ICCV 2021 - Image Retrieval o… ☆36 · Updated 4 months ago
- Accelerating Vision-Language Pretraining with Free Language Modeling (CVPR 2023) ☆31 · Updated last year
- Official PyTorch implementation of Clover: Towards A Unified Video-Language Alignment and Fusion Model (CVPR2023) ☆40 · Updated last year
- ☆101 · Updated last year
- Turning to Video for Transcript Sorting ☆46 · Updated last year
- Code implementation of paper "MUSE: Mamba is Efficient Multi-scale Learner for Text-video Retrieval" ☆11 · Updated 2 months ago
- The Pytorch implementation for "Video-Text Pre-training with Learned Regions" ☆42 · Updated 2 years ago
- Source code of our MM'22 paper Cross-Lingual Cross-Modal Retrieval with Noise-Robust Learning ☆21 · Updated 4 months ago
- Official repository for "Vita-CLIP: Video and text adaptive CLIP via Multimodal Prompting" [CVPR 2023] ☆107 · Updated last year
- End-to-end Multi-modal Video Temporal Grounding, NeurIPS 2021 ☆18 · Updated 3 years ago
- [ACM MM 22] Correspondence Matters for Video Referring Expression Comprehension ☆14 · Updated 2 years ago
- This repo is the official implementation of UPL (Unsupervised Prompt Learning for Vision-Language Models). ☆106 · Updated 2 years ago
- ☆20 · Updated 2 years ago
- [NeurIPS 2024] Lumen: a Large multimodal model with versatile vision-centric capabilities ☆22 · Updated last month
- [CVPR 2024] Context-Guided Spatio-Temporal Video Grounding ☆40 · Updated 4 months ago