Sparrow: Data-Efficient Video-LLM with Text-to-Image Augmentation
☆31Mar 28, 2025Updated 11 months ago
Alternatives and similar repositories for Sparrow
Users that are interested in Sparrow are comparing it to the libraries listed below
Sorting:
- ✨✨ [ICLR 2026] MME-Unify: A Comprehensive Benchmark for Unified Multimodal Understanding and Generation Models☆43Apr 10, 2025Updated 10 months ago
- ✨✨Long-VITA: Scaling Large Multi-modal Models to 1 Million Tokens with Leading Short-Context Accuracy☆306May 14, 2025Updated 9 months ago
- Repo for paper "T2Vid: Translating Long Text into Multi-Image is the Catalyst for Video-LLMs"☆48Sep 3, 2025Updated 6 months ago
- ✨✨[NeurIPS 2025] This is the official implementation of our paper "Video-RAG: Visually-aligned Retrieval-Augmented Long Video Comprehensi…☆402Jan 14, 2026Updated last month
- code for promptCSE, emnlp 2022☆11Apr 10, 2023Updated 2 years ago
- ✨✨ [ICLR 2026] R1-Reward: Training Multimodal Reward Model Through Stable Reinforcement Learning☆281May 9, 2025Updated 9 months ago
- ✨✨Freeze-Omni: A Smart and Low Latency Speech-to-speech Dialogue Model with Frozen LLM☆368May 27, 2025Updated 9 months ago
- ✨✨[CVPR 2025] Video-MME: The First-Ever Comprehensive Evaluation Benchmark of Multi-modal LLMs in Video Analysis☆731Dec 8, 2025Updated 2 months ago
- code for COLING paper "A Hybrid Model of Classification and Generation for Spatial Relation Extraction"☆10Oct 20, 2022Updated 3 years ago
- Code for our EMNLP 2022 paper: Generative Entity Typing with Curriculum Learning.☆13Aug 19, 2023Updated 2 years ago
- ☆37Jul 9, 2024Updated last year
- Implementation for the paper "Unified Multimodal Model with Unlikelihood Training for Visual Dialog"☆13May 12, 2023Updated 2 years ago
- 在监控画质下实现对校园自行车的重识别,包含REID模型识别,向量数据库检索,UI展示☆10Feb 13, 2024Updated 2 years ago
- List of papers on Hallucination in LMM☆10Nov 29, 2023Updated 2 years ago
- Official code for the paper: Robust Human Detection under Visual Degradation via Thermal and mmWave Radar Fusion☆12Jul 4, 2023Updated 2 years ago
- [ICCV 2021] Multimodal Knowledge Expansion☆10Aug 28, 2021Updated 4 years ago
- ☆10Oct 24, 2022Updated 3 years ago
- ☆10May 12, 2023Updated 2 years ago
- 大数据生态解决方案基础平台: 搜索系统、公共系统、任务管理系统、数据binlog采集、基础爬虫系统、数据传输系统、运维告警系统、APM、报表系统☆11Jan 25, 2021Updated 5 years ago
- [ACL 2023] Code and data for our paper "Measuring Progress in Fine-grained Vision-and-Language Understanding"☆13Jun 11, 2023Updated 2 years ago
- Accelerating GOT-OCRv2 with VLLM☆11Nov 15, 2024Updated last year
- ☆22Feb 3, 2026Updated last month
- Implementation of PPO for CartPole-v1☆10Jan 1, 2019Updated 7 years ago
- ☆10Nov 23, 2023Updated 2 years ago
- ☆11Jul 30, 2025Updated 7 months ago
- The codes and datasets about our ACL 2024 Main Conference paper titled "Cognitive Visual-Language Mapper: Advancing Multimodal Comprehens…☆17Jan 24, 2025Updated last year
- 基于golang go语言(beego框架)下的ONLYOFFICE Document Server二次开发。 主要功能为文档的上传、预览、覆盖、回调等功能。☆10Oct 20, 2023Updated 2 years ago
- onlyoffice 破解版,支持x86和arm64架构,arm架构切换到arm分支即可☆14Dec 16, 2022Updated 3 years ago
- ☆10Oct 17, 2022Updated 3 years ago
- ☆10Nov 7, 2018Updated 7 years ago
- DreamDance: Personalized Text-to-video Generation by Combining Text-to-Image Synthesis and Motion Transfer☆14Dec 16, 2022Updated 3 years ago
- [TIM 2025] Towards Accurate Readings of Water Meters by Eliminating Transition Error: New Dataset and Effective Solution☆13Mar 5, 2025Updated last year
- Through-Wall Human Pose Detection via Radar☆13May 9, 2022Updated 3 years ago
- Official pytorch implementation of "RITUAL: Random Image Transformations as a Universal Anti-hallucination Lever in Large Vision Language…☆14Dec 16, 2024Updated last year
- KGML for EMNLP 2021☆10Feb 2, 2022Updated 4 years ago
- “Style Transfer as Data Augmentation: A Case Study on Named Entity Recognition” (EMNLP 2022)☆16Feb 2, 2023Updated 3 years ago
- CoV: Chain-of-View Prompting for Spatial Reasoning☆51Jan 23, 2026Updated last month
- Official implementation of "Relational Proxies: Emergent Relationships as Fine-Grained Discriminators", NeurIPS 2022.☆14Feb 1, 2025Updated last year
- Code for [RA-L] High-Fidelity Simulated Data Generation for Real-World Zero-Shot Robotic Manipulation Learning with Gaussian Splatting☆51Feb 25, 2026Updated last week