HumanMLLM / Omni-Emotion

☆21

Alternatives and similar repositories for Omni-Emotion

Users who are interested in Omni-Emotion are comparing it to the repositories listed below.

RainBowLuoCS / OpenOmni
(NIPS 2025) OpenOmni: Official implementation of Advancing Open-Source Omnimodal Large Language Models with Progressive Multimodal Align…
☆107Updated last month
rikeilong / Bay-CAT
[ECCV’24] Official Implementation for CAT: Enhancing Multimodal Large Language Model to Answer Questions in Dynamic Audio-Visual Scenario…
☆57Updated last year
HarryHsing / EchoInk
EchoInk-R1: Exploring Audio-Visual Reasoning in Multimodal LLMs via Reinforcement Learning [🔥The Exploration of R1 for General Audio-Vi…
☆60Updated 5 months ago
Exploring-Embodied-Emotion-official / E3
☆18Updated 4 months ago
BriansIDP / video-SALMONN-o1
☆35Updated 2 months ago
GeWu-Lab / Crab
[CVPR 2025] Crab: A Unified Audio-Visual Scene Understanding Model with Explicit Cooperation
☆73Updated 4 months ago
threegold116 / Awesome-Omni-MLLMs
This is for ACL 2025 Findings Paper: From Specific-MLLMs to Omni-MLLMs: A Survey on MLLMs Aligned with Multi-modalities
☆60Updated last month
zeroQiaoba / gpt4v-emotion
GPT-4V with Emotion
☆95Updated last year
schowdhury671 / meerkat
☆33Updated 3 months ago
zeroQiaoba / AffectGPT
Explainable Multimodal Emotion Reasoning (EMER), OV-MER (ICML), and AffectGPT (ICML, Oral)
☆271Updated 2 months ago
emova-ollm / EMOVA
Official PyTorch implementation of EMOVA in CVPR 2025 (https://arxiv.org/abs/2409.18042)
☆74Updated 7 months ago
24DavidHuang / Emotion-Qwen
Welcome to the official repository of Emotion-Qwen.
☆20Updated 4 months ago
ttgeng233 / LongVALE
LongVALE: Vision-Audio-Language-Event Benchmark Towards Time-Aware Omni-Modal Perception of Long Videos. (CVPR 2025)
☆51Updated 4 months ago
HumanMLLM / HumanOmni
HumanOmni
☆201Updated 7 months ago
aimmemotion / EmoVIT
[CVPR 2024] EmoVIT: Revolutionizing Emotion Insights with Visual Instruction Tuning
☆36Updated 6 months ago
fuyyyyy / SEPM
[ICML'25 Spotlight] Catch Your Emotion: Sharpening Emotion Perception in Multimodal Large Language Models
☆37Updated last month
thuiar / MIntRec2.0
MIntRec2.0 is the first large-scale dataset for multimodal intent recognition and out-of-scope detection in multi-party conversations (IC…
☆63Updated 2 months ago
sunlicai / HiCMAE
[Information Fusion 2024] HiCMAE: Hierarchical Contrastive Masked Autoencoder for Self-Supervised Audio-Visual Emotion Recognition
☆114Updated 2 months ago
MC-EIU / MC-EIU
☆24Updated 6 months ago
ttgeng233 / UniAV
Unified Audio-Visual Perception for Multi-Task Video Localization
☆28Updated last year
multimodal-art-projection / OmniBench
A project for tri-modal LLM benchmarking and instruction tuning.
☆48Updated 7 months ago
thuiar / TCL-MAP
TCL-MAP is a powerful method for multimodal intent recognition (AAAI 2024)
☆51Updated last year
GeWu-Lab / TSPM
Official repository for "Boosting Audio Visual Question Answering via Key Semantic-Aware Cues" in ACM MM 2024.
☆17Updated last year
yannqi / COMBO-AVS
[CVPR 2024 Highlight] Official implementation of the paper: Cooperation Does Matter: Exploring Multi-Order Bilateral Relations for Audio-…
☆39Updated 6 months ago
chengzju / CARAT
☆22Updated 6 months ago
the-anonymous-bs / av-SALMONN
av-SALMONN: Speech-Enhanced Audio-Visual Large Language Models
☆13Updated last year
haoyi-duan / DG-SCT
NeurIPS'2023 official implementation code
☆66Updated last year
GalaxyCong / EmoDubber
Official source codes for the paper: EmoDubber: Towards High Quality and Emotion Controllable Movie Dubbing.
☆29Updated 4 months ago
yan9qu / EmoLLM
EmoLLM: Multimodal Emotional Understanding Meets Large Language Models
☆19Updated last year
GeWu-Lab / LFAV
Towards Long Form Audio-visual Video Understanding
☆15Updated 6 months ago