nku-shengzheliu / SER30K

[ACM MM 2022 Oral] This is the official implementation of "SER30K: A Large-Scale Dataset for Sticker Emotion Recognition"

☆28

Alternatives and similar repositories for SER30K

Users who are interested in SER30K compare it to the repositories listed below.

yuezih / Movie101
Narrative movie understanding benchmark
☆76Updated 5 months ago
AlignGPT-VL / AlignGPT
Official repo for "AlignGPT: Multi-modal Large Language Models with Adaptive Alignment Capability"
☆34Updated last year
ZhangYiqun018 / StickerConv
☆59Updated last year
xiangyu-mm / EasyGen
The official code for paper "EasyGen: Easing Multimodal Generation with a Bidirectional Conditional Diffusion Model and LLMs"
☆73Updated last year
RUC-AIMind / TikTalk
☆70Updated 5 months ago
Exploring-Embodied-Emotion-official / E3
☆19Updated 4 months ago
WangFei-2019 / Image-text-Retrieval
☆46Updated 3 years ago
HenryHZY / VL-PET
[ICCV2023] Official code for "VL-PET: Vision-and-Language Parameter-Efficient Tuning via Granularity Control"
☆52Updated 2 years ago
Lilidamowang / T2VIndexer-generativeSearch
☆12Updated last year
X-PLUG / mPLUG
mPLUG: Effective and Efficient Vision-Language Learning by Cross-modal Skip-connections. (EMNLP 2022)
☆96Updated 2 years ago
foundation-multimodal-models / CAL
[NeurIPS'24] Official PyTorch Implementation of Seeing the Image: Prioritizing Visual Correlation by Contrastive Alignment
☆57Updated last year
360CVGroup / Inner-Adaptor-Architecture
LMM solved catastrophic forgetting, AAAI2025
☆44Updated 7 months ago
junyangwang0410 / HaELM
An automatic MLLM hallucination detection framework
☆19Updated 2 years ago
ChenDelong1999 / polite-flamingo
🦩 Visual Instruction Tuning with Polite Flamingo - training multi-modal LLMs to be both clever and polite! (AAAI-24 Oral)
☆64Updated last year
bighuang624 / VoP
[CVPR 2023] VoP: Text-Video Co-operative Prompt Tuning for Cross-Modal Retrieval
☆38Updated 2 years ago
SihengLi99 / TextBind
[2024-ACL]: TextBind: Multi-turn Interleaved Multimodal Instruction-following in the Wild
☆47Updated 2 years ago
thunlp / Muffin
☆66Updated last year
patrick-tssn / VSTAR
[ACL 2023] VSTAR is a multimodal dialogue dataset with scene and topic transition information
☆15Updated last year
palchenli / VL-Instruction-Tuning
☆91Updated last year
FudanDISC / ReForm-Eval
A benchmark for evaluating the capabilities of large vision-language models (LVLMs)
☆45Updated 2 years ago
BetterZH / SEVLM-code
Training A Small Emotional Vision Language Model for Visual Art Comprehension
☆15Updated last year
QQ-MM / PureMM
☆21Updated last year
findalexli / mllm-dpo
[ACL 2024] Multi-modal preference alignment remedies regression of visual instruction tuning on language model
☆48Updated last year
zengyan-97 / CCLM
Cross-View Language Modeling: Towards Unified Cross-Lingual Cross-Modal Pre-training (ACL 2023)
☆92Updated 2 years ago
TXH-mercury / COSA
[ICLR2024] Codes and Models for COSA: Concatenated Sample Pretrained Vision-Language Foundation Model
☆43Updated 10 months ago
AGI-Edgerunners / IIL
Code for our Paper "All in an Aggregated Image for In-Image Learning"
☆29Updated last year
llyx97 / FETV
[NeurIPS 2023 Datasets and Benchmarks] "FETV: A Benchmark for Fine-Grained Evaluation of Open-Domain Text-to-Video Generation", Yuanxin L…
☆56Updated last year
X-PLUG / mPLUG-HalOwl
mPLUG-HalOwl: Multimodal Hallucination Evaluation and Mitigating
☆98Updated last year
PLUM-Lab / MultiInstruct
MultiInstruct: Improving Multi-Modal Zero-Shot Learning via Instruction Tuning
☆133Updated 2 years ago
Liuziyu77 / MMDU
Official repository of MMDU dataset
☆97Updated last year