CoreJT / NLPPapersSpider
☆10Updated 6 years ago
Alternatives and similar repositories for NLPPapersSpider
Users that are interested in NLPPapersSpider are comparing it to the libraries listed below
- A thin wrapper around ChatGPT for improving paper writing.☆253Updated 2 years ago
- ☆43Updated 8 months ago
- ☆35Updated 4 years ago
- Code for "Visual Spatial Description: Controlled Spatial-Oriented Image-to-Text Generation"☆26Updated last year
- This is for ACL 2025 Findings Paper: From Specific-MLLMs to Omni-MLLMs: A Survey on MLLMs Aligned with Multi-modalities☆89Updated last month
- ☆20Updated 7 months ago
- Update 2020☆75Updated 3 years ago
- [MIR-2023-Survey] A continuously updated paper list for multi-modal pre-trained big models☆290Updated 6 months ago
- The source code for "UniBind: LLM-Augmented Unified and Balanced Representation Space to Bind Them All"☆49Updated last year
- WorldSense: Evaluating Real-world Omnimodal Understanding for Multimodal LLMs☆38Updated 2 weeks ago
- A modified LLaVA framework for MOSS2 that makes MOSS2 a multimodal model.☆13Updated last year
- [ICLR 2024 (Spotlight)] "Frozen Transformers in Language Models are Effective Visual Encoder Layers"☆247Updated 2 years ago
- Build a daily academic subscription pipeline! Get daily Arxiv papers and corresponding chatGPT summaries with pre-defined keywords. It is…☆46Updated 2 years ago
- EchoInk-R1: Exploring Audio-Visual Reasoning in Multimodal LLMs via Reinforcement Learning [🔥The Exploration of R1 for General Audio-Vi…☆73Updated 8 months ago
- A convenient script for grabbing GPUs☆393Updated last month
- ☆155Updated 8 months ago
- [CVPR'24 Highlight] The official code and data for paper "EgoThink: Evaluating First-Person Perspective Thinking Capability of Vision-Lan…☆63Updated 10 months ago
- [ACM MM 2022 Oral] This is the official implementation of "SER30K: A Large-Scale Dataset for Sticker Emotion Recognition"☆29Updated 3 years ago
- [ICCV2023] Official code for "VL-PET: Vision-and-Language Parameter-Efficient Tuning via Granularity Control"☆53Updated 2 years ago
- Official repo for "Streaming Video Understanding and Multi-round Interaction with Memory-enhanced Knowledge" ICLR2025☆100Updated 10 months ago
- Video Chain of Thought, Codes for ICML 2024 paper: "Video-of-Thought: Step-by-Step Video Reasoning from Perception to Cognition"☆178Updated 11 months ago
- A comprehensive overview of affective computing research in the era of large language models (LLMs).☆30Updated last year
- Latest Papers, Codes and Datasets on VTG-LLMs.☆79Updated 2 months ago
- Learning Situation Hyper-Graphs for Video Question Answering☆22Updated last year
- Code for DeCo: Decoupling token compression from semantic abstraction in multimodal large language models☆77Updated 6 months ago
- Code for CVPR2021 Paper “Cascaded Prediction Network via Segment Tree for Temporal Video Grounding”☆10Updated 3 years ago
- Reading papers with Mu Li☆217Updated 7 months ago
- ChatGPT - Review & Rebuttal: A browser extension for generating reviews and rebuttals, powered by ChatGPT.☆251Updated 2 years ago
- Official repository of paper "LOVE-R1: Advancing Long Video Understanding with Adaptive Zoom-in Mechanism via Multi-Step Reasoning"☆20Updated 3 months ago
- Official repo for "AlignGPT: Multi-modal Large Language Models with Adaptive Alignment Capability"☆34Updated last year