CoreJT / NLPPapersSpider
☆10 · Updated 6 years ago
Alternatives and similar repositories for NLPPapersSpider
Users interested in NLPPapersSpider are comparing it to the libraries listed below
- ☆43 · Updated 8 months ago
- Update 2020 ☆75 · Updated 3 years ago
- Code for "Visual Spatial Description: Controlled Spatial-Oriented Image-to-Text Generation" ☆26 · Updated last year
- ☆20 · Updated 7 months ago
- This is for ACL 2025 Findings Paper: From Specific-MLLMs to Omni-MLLMs: A Survey on MLLMs Aligned with Multi-modalities ☆89 · Updated last month
- A thin wrapper of ChatGPT for improving paper writing. ☆253 · Updated 2 years ago
- UniSA: Unified Generative Framework for Sentiment Analysis ☆51 · Updated last year
- [CVPR 2024] MMSum: A Dataset for Multimodal Summarization and Thumbnail Generation of Videos ☆37 · Updated last year
- TCL-MAP is a powerful method for multimodal intent recognition (AAAI 2024) ☆56 · Updated 2 years ago
- PyTorch implementation for Tailor Versatile Multi-modal Learning for Multi-label Emotion Recognition ☆65 · Updated 3 years ago
- A comprehensive overview of affective computing research in the era of large language models (LLMs). ☆30 · Updated last year
- Learning Situation Hyper-Graphs for Video Question Answering ☆22 · Updated last year
- ☆35 · Updated 4 years ago
- [ICLR2024] The official implementation of paper "UniAdapter: Unified Parameter-Efficient Transfer Learning for Cross-modal Modeling", by … ☆77 · Updated 2 years ago
- [CVPR'24 Highlight] The official code and data for paper "EgoThink: Evaluating First-Person Perspective Thinking Capability of Vision-Lan… ☆63 · Updated 10 months ago
- The repo for "Balanced Multimodal Learning via On-the-fly Gradient Modulation", CVPR 2022 (Oral) ☆308 · Updated 4 months ago
- ChatGPT - Review & Rebuttal: A browser extension for generating review comments and rebuttals, powered by ChatGPT. ☆251 · Updated 2 years ago
- [MIR-2023-Survey] A continuously updated paper list for multi-modal pre-trained big models ☆290 · Updated 6 months ago
- ☆17 · Updated 2 years ago
- Vision Transformers are Parameter-Efficient Audio-Visual Learners ☆106 · Updated 2 years ago
- [ICLR 2024 (Spotlight)] "Frozen Transformers in Language Models are Effective Visual Encoder Layers" ☆247 · Updated 2 years ago
- ☆46 · Updated 4 years ago
- Code for CVPR2021 Paper "Cascaded Prediction Network via Segment Tree for Temporal Video Grounding" ☆10 · Updated 3 years ago
- Modified LLaVA framework for MOSS2, making MOSS2 a multimodal model. ☆13 · Updated last year
- [Paper][AAAI2024] Structure-CLIP: Towards Scene Graph Knowledge to Enhance Multi-modal Structured Representations ☆153 · Updated last year
- MIntRec2.0 is the first large-scale dataset for multimodal intent recognition and out-of-scope detection in multi-party conversations (IC… ☆71 · Updated 5 months ago
- [NeurIPS 2023 Datasets and Benchmarks Track] LAMM: Multi-Modal Large Language Models and Applications as AI Agents ☆317 · Updated last year
- [NeurIPS 2022 Spotlight] Expectation-Maximization Contrastive Learning for Compact Video-and-Language Representations ☆143 · Updated last year
- Official repository for "Boosting Audio Visual Question Answering via Key Semantic-Aware Cues" in ACM MM 2024. ☆16 · Updated last year
- [ACM MM 2022 Oral] This is the official implementation of "SER30K: A Large-Scale Dataset for Sticker Emotion Recognition" ☆29 · Updated 3 years ago