dieuroi / SimAesthetics
☆10Updated 10 months ago
Alternatives and similar repositories for SimAesthetics:
Users who are interested in SimAesthetics are comparing it to the repositories listed below:
- This is a list of resources that utilize machine learning technologies to solve image aesthetic assessment.☆43Updated 9 months ago
- (ICML 2024) Improve Context Understanding in Multimodal Large Language Models via Multimodal Composition Learning☆28Updated 6 months ago
- Are Binary Annotations Sufficient? Video Moment Retrieval via Hierarchical Uncertainty-based Active Learning☆13Updated last year
- Ask&Confirm: Active Detail Enriching for Cross-Modal Retrieval with Partial Query (ICCV2021)☆20Updated 3 years ago
- COLA: Evaluate how well your vision-language model can Compose Objects Localized with Attributes!☆24Updated 4 months ago
- Repo for the paper "Towards Realistic Zero-Shot Classification via Self Structural Semantic Alignment" (AAAI'24 Oral)☆25Updated 10 months ago
- "Video Moment Retrieval from Text Queries via Single Frame Annotation" in SIGIR 2022☆69Updated 2 years ago
- Code for Motion-aware Contrastive Video Representation Learning via Foreground-background Merging (CVPR 2022)☆46Updated last year
- paper list on Video Moment Retrieval (VMR), or Natural Language Video Localization (NLVL), or Temporal Sentence Grounding in Videos (TSGV…☆31Updated 2 years ago
- [ECCV'22 Poster] Explicit Image Caption Editing☆21Updated 2 years ago
- End-to-end Multi-modal Video Temporal Grounding, NeurIPS 2021☆18Updated 3 years ago
- Placeholder for code of BSP.☆11Updated 3 years ago
- Code for the paper "Zero-shot Natural Language Video Localization" (ICCV2021, Oral).☆47Updated 2 years ago
- Official PyTorch implementation of Clover: Towards A Unified Video-Language Alignment and Fusion Model (CVPR2023)☆41Updated 2 years ago
- Context-I2W: Mapping Images to Context-dependent words for Accurate Zero-Shot Composed Image Retrieval [AAAI 2024 Oral]☆50Updated 4 months ago
- The PyTorch implementation for "Video-Text Pre-training with Learned Regions"☆42Updated 2 years ago
- Source code of our CVPR2024 paper TeachCLIP for Text-to-Video Retrieval☆29Updated last month
- Source code of our MM'22 paper Cross-Lingual Cross-Modal Retrieval with Noise-Robust Learning☆21Updated 9 months ago
- Let there be clock in the beach - WACV 2022☆15Updated 3 years ago
- (NeurIPS 2024 Spotlight) TOPA: Extend Large Language Models for Video Understanding via Text-Only Pre-Alignment☆26Updated 6 months ago
- Code for CVPR2023 paper "Collaborative Noisy Label Cleaner: Learning Scene-aware Trailers for Multi-modal Highlight Detection in Movies"☆17Updated 2 years ago
- [ECCV 2024] Learning Video Context as Interleaved Multimodal Sequences☆36Updated 3 weeks ago
- Source code of our MM'22 paper Partially Relevant Video Retrieval☆53Updated 5 months ago
- [ICCV2021] Generic Event Boundary Detection: A Benchmark for Event Segmentation☆68Updated 3 years ago