CoreJT / NLPPapersSpider
☆10Updated 5 years ago
Alternatives and similar repositories for NLPPapersSpider
Users that are interested in NLPPapersSpider are comparing it to the libraries listed below
- ☆43Updated 6 months ago
- This is for ACL 2025 Findings Paper: From Specific-MLLMs to Omni-MLLMs: A Survey on MLLMs Aligned with Multi-modalitiesModels☆73Updated 2 weeks ago
- A thin wrapper around ChatGPT for improving paper writing.☆253Updated 2 years ago
- [MIR-2023-Survey] A continuously updated paper list for multi-modal pre-trained big models☆289Updated 4 months ago
- A convenient script for grabbing GPUs☆380Updated 10 months ago
- ChatGPT - Review & Rebuttal: A browser extension for generating reviews and rebuttals, powered by ChatGPT.☆251Updated 2 years ago
- A Python implementation of Certifiable Robust Multi-modal Training☆19Updated 5 months ago
- ☆35Updated 4 years ago
- [ICLR 2024 (Spotlight)] "Frozen Transformers in Language Models are Effective Visual Encoder Layers"☆245Updated last year
- Code for "Visual Spatial Description: Controlled Spatial-Oriented Image-to-Text Generation"☆26Updated last year
- ☆20Updated 4 months ago
- EchoInk-R1: Exploring Audio-Visual Reasoning in Multimodal LLMs via Reinforcement Learning [🔥The Exploration of R1 for General Audio-Vi…☆63Updated 6 months ago
- Update 2020☆76Updated 3 years ago
- A paper list about diffusion models for natural language processing.☆182Updated 2 years ago
- A modified LLaVA framework for MOSS2 that makes MOSS2 a multimodal model.☆13Updated last year
- A comprehensive overview of affective computing research in the era of large language models (LLMs).☆27Updated last year
- ☆33Updated 7 months ago
- Code for DeCo: Decoupling token compression from semantic abstraction in multimodal large language models☆74Updated 4 months ago
- Code for CVPR2021 Paper “Cascaded Prediction Network via Segment Tree for Temporal Video Grounding”☆10Updated 3 years ago
- [CVPR 2024] MMSum: A Dataset for Multimodal Summarization and Thumbnail Generation of Videos☆36Updated 10 months ago
- TCL-MAP is a powerful method for multimodal intent recognition (AAAI 2024)☆53Updated last year
- MUSIC-AVQA, CVPR2022 (ORAL)☆90Updated 2 years ago
- [ACM MM 2022 Oral] This is the official implementation of "SER30K: A Large-Scale Dataset for Sticker Emotion Recognition"☆29Updated 3 years ago
- Build a daily academic subscription pipeline! Get daily arXiv papers and corresponding ChatGPT summaries with pre-defined keywords. It is…☆46Updated 2 years ago
- ☆31Updated last year
- Watch for idle GPUs and run your jobs: launches jobs in tmux, keeps logs/status, and sends start/finish emails.☆79Updated 2 months ago
- Video Chain of Thought, Codes for ICML 2024 paper: "Video-of-Thought: Step-by-Step Video Reasoning from Perception to Cognition"☆169Updated 9 months ago
- Latest Papers, Codes and Datasets on VTG-LLMs.☆55Updated last week
- Official repo for "Streaming Video Understanding and Multi-round Interaction with Memory-enhanced Knowledge" ICLR2025☆86Updated 8 months ago
- [ICLR2024] The official implementation of paper "UniAdapter: Unified Parameter-Efficient Transfer Learning for Cross-modal Modeling", by …☆77Updated last year