NickyFot / EmoCommonSenseLinks
Official Repository for VLLMs Provide Better Context for Emotion Understanding Through Common Sense Reasoning
☆24Updated last year
Alternatives and similar repositories for EmoCommonSense
Users that are interested in EmoCommonSense are comparing it to the libraries listed below
Sorting:
- ☆119Updated last year
- [ICLR 2024] FROSTER: Frozen CLIP is a Strong Teacher for Open-Vocabulary Action Recognition☆91Updated 9 months ago
- [ICLR 2025] TRACE: Temporal Grounding Video LLM via Casual Event Modeling☆134Updated 2 months ago
- [CVPR 2024] TeachCLIP for Text-to-Video Retrieval☆40Updated 6 months ago
- [CVPR 2024] Context-Guided Spatio-Temporal Video Grounding☆61Updated last year
- Official implementation of the paper "Boosting Human-Object Interaction Detection with Text-to-Image Diffusion Model"☆64Updated 2 years ago
- [ICCV 2023] ALIP: Adaptive Language-Image Pre-training with Synthetic Caption☆101Updated 2 years ago
- [AAAI 2024] DGL: Dynamic Global-Local Prompt Tuning for Text-Video Retrieval.☆44Updated last year
- ☆53Updated last year
- Code implementation of paper "MUSE: Mamba is Efficient Multi-scale Learner for Text-video Retrieval (AAAI2025)"☆24Updated 9 months ago
- ☆84Updated 2 years ago
- ☆68Updated last year
- 🌀 R2-Tuning: Efficient Image-to-Video Transfer Learning for Video Temporal Grounding (ECCV 2024)☆90Updated last year
- Official PyTorch implementation of the paper "Revisiting Temporal Modeling for CLIP-based Image-to-Video Knowledge Transferring"☆107Updated last year
- Composed Video Retrieval☆61Updated last year
- Code for CVPR2023 paper "Collaborative Noisy Label Cleaner: Learning Scene-aware Trailers for Multi-modal Highlight Detection in Movies"☆17Updated 2 years ago
- Pytorch Code for "Unified Coarse-to-Fine Alignment for Video-Text Retrieval" (ICCV 2023)☆66Updated last year
- ☆30Updated 2 years ago
- [CVPR 2024] Official PyTorch implementation of the paper "One For All: Video Conversation is Feasible Without Video Instruction Tuning"☆35Updated last year
- This repo contains source code for Glance and Focus: Memory Prompting for Multi-Event Video Question Answering (Accepted in NeurIPS 2023)☆29Updated last year
- 👾 E.T. Bench: Towards Open-Ended Event-Level Video-Language Understanding (NeurIPS 2024)☆69Updated 9 months ago
- MAtch, eXpand and Improve: Unsupervised Finetuning for Zero-Shot Action Recognition with Language Knowledge (ICCV 2023)☆30Updated 2 years ago
- Official repository for "Vita-CLIP: Video and text adaptive CLIP via Multimodal Prompting" [CVPR 2023]☆125Updated 2 years ago
- [ICCV 2025] Enhancing Partially Relevant Video Retrieval with Hyperbolic Learning.☆45Updated last month
- Official implementation of CVPR 2024 paper "vid-TLDR: Training Free Token merging for Light-weight Video Transformer".☆52Updated 2 weeks ago
- [BMVC 2023] Zero-shot Composed Text-Image Retrieval☆54Updated 11 months ago
- ☆80Updated 11 months ago
- Code for "Modeling Multimodal Social Interactions: New Challenges and Baselines with Densely Aligned Representations" (CVPR 2024 Oral)☆17Updated last year
- [IJCAI 2023] Text-Video Retrieval with Disentangled Conceptualization and Set-to-Set Alignment☆53Updated last year
- Disentangled Pre-training for Human-Object Interaction Detection☆26Updated last month