scofield7419 / EmpathyEar
Multimodal Empathetic Chatbot
☆40 · Updated 10 months ago
Alternatives and similar repositories for EmpathyEar
Users interested in EmpathyEar are comparing it to the libraries listed below.
- GPT-4V with Emotion ☆93 · Updated last year
- ☆15 · Updated 11 months ago
- HumanOmni ☆169 · Updated 2 months ago
- A fully open-source implementation of a GPT-4o-like speech-to-speech video understanding model ☆17 · Updated 2 months ago
- PyTorch implementation of the model from "MIRASOL3B: A Multimodal Autoregressive Model for Time-Aligned and Contextual Modalities" ☆26 · Updated 4 months ago
- ☆20 · Updated 4 months ago
- OpenOmni: Official implementation of Advancing Open-Source Omnimodal Large Language Models with Progressive Multimodal Alignment and Rea… ☆53 · Updated this week
- [ECCV’24] Official implementation for CAT: Enhancing Multimodal Large Language Model to Answer Questions in Dynamic Audio-Visual Scenario… ☆53 · Updated 9 months ago
- [ACL 2023] VSTAR is a multimodal dialogue dataset with scene and topic transition information ☆12 · Updated 7 months ago
- [ACL 2024 Findings] "TempCompass: Do Video LLMs Really Understand Videos?", Yuanxin Liu, Shicheng Li, Yi Liu, Yuxiang Wang, Shuhuai Ren, … ☆115 · Updated 2 months ago
- ☆46 · Updated last month
- [ACM MM 2022 Oral] This is the official implementation of "SER30K: A Large-Scale Dataset for Sticker Emotion Recognition" ☆24 · Updated 2 years ago
- [CVPR 2024] EmoVIT: Revolutionizing Emotion Insights with Visual Instruction Tuning ☆31 · Updated last month
- [ACM ICMR'25] Official repository for "eMotions: A Large-Scale Dataset for Emotion Recognition in Short Videos" ☆33 · Updated 11 months ago
- LMM solved catastrophic forgetting, AAAI 2025 ☆43 · Updated last month
- Explainable Multimodal Emotion Reasoning (EMER), Open-vocabulary MER (OV-MER), and AffectGPT ☆182 · Updated 2 weeks ago
- ☆15 · Updated 2 years ago
- Official implementation of the paper AdaReTaKe: Adaptive Redundancy Reduction to Perceive Longer for Video-language Understanding ☆62 · Updated last month
- Narrative movie understanding benchmark ☆71 · Updated last year
- ☆87 · Updated 9 months ago
- LAVIS - A One-stop Library for Language-Vision Intelligence ☆48 · Updated 10 months ago
- ☆49 · Updated 11 months ago
- Code for "Talk With Human-like Agents: Empathetic Dialogue Through Perceptible Acoustic Reception and Reaction" (ACL 2024) ☆45 · Updated 10 months ago
- [ICLR 2025] CREMA: Generalizable and Efficient Video-Language Reasoning via Multimodal Modular Fusion ☆45 · Updated 4 months ago
- Official repository of VideoLLaMB: Long Video Understanding with Recurrent Memory Bridges ☆68 · Updated 3 months ago
- EmoLLM: Multimodal Emotional Understanding Meets Large Language Models ☆14 · Updated 11 months ago
- ☆21 · Updated last month
- SpeechAgents: Human-Communication Simulation with Multi-Modal Multi-Agent Systems ☆82 · Updated last year
- Repository for the ACL 2025 Findings paper "From Specific-MLLMs to Omni-MLLMs: A Survey on MLLMs Aligned with Multi-modalities" ☆33 · Updated last week
- Official implementation of the paper VideoLLM Knows When to Speak: Enhancing Time-Sensitive Video Comprehension with Video-Text Duet Interact… ☆31 · Updated 4 months ago