VITA-MLLM/Sparrow

Sparrow: Data-Efficient Video-LLM with Text-to-Image Augmentation

☆32

Alternatives and similar repositories for Sparrow

Users who are interested in Sparrow are comparing it to the repositories listed below.

MME-Benchmarks / MME-Unify
✨✨ [ICLR 2026] MME-Unify: A Comprehensive Benchmark for Unified Multimodal Understanding and Generation Models
☆42 · Apr 10, 2025 · Updated last year
Northern-byte-bit / SpeechParaling-Bench
☆30 · May 21, 2026 · Updated 2 months ago
Kwai-YuanQi / MM-RLHF
The Next Step Forward in Multimodal LLM Alignment
☆198 · May 1, 2025 · Updated last year
MAC-AutoML / QuoTA
✨✨ [AAAI 2026] This is the official implementation of our paper "QuoTA: Query-oriented Token Assignment via CoT Query Decouple for Long Vi…
☆79 · Apr 28, 2025 · Updated last year
xjtupanda / Sparrow
Repo for paper "T2Vid: Translating Long Text into Multi-Image is the Catalyst for Video-LLMs"
☆48 · Sep 3, 2025 · Updated 10 months ago
VITA-MLLM / Long-VITA
✨✨ Long-VITA: Scaling Large Multi-modal Models to 1 Million Tokens with Leading Short-Context Accuracy
☆305 · May 14, 2025 · Updated last year
MiG-NJU / PersonaVLM
[CVPR 2026 Highlight] PersonaVLM: Long-Term Personalized Multimodal LLMs
☆112 · Apr 16, 2026 · Updated 3 months ago
yangruoliu / VideoDetective
VideoDetective: Clue Hunting via both Extrinsic Query and Intrinsic Relevance for Long Video Understanding
☆58 · May 1, 2026 · Updated 2 months ago
VITA-MLLM / Omni-Diffusion
✨✨ [ICML 2026] Omni-Diffusion: Unified Multimodal Understanding and Generation with Masked Discrete Diffusion
☆153 · Mar 12, 2026 · Updated 4 months ago
Leon1207 / Video-RAG-master
✨✨ [NeurIPS 2025] This is the official implementation of our paper "Video-RAG: Visually-aligned Retrieval-Augmented Long Video Comprehensi…
☆446 · Jun 26, 2026 · Updated 3 weeks ago
MME-Benchmarks / Video-MME-v2
Video-MME-v2: Towards the Next Stage in Benchmarks for Comprehensive Video Understanding
☆369 · May 24, 2026 · Updated 2 months ago
MiG-NJU / EvoEmbedding
EvoEmbedding: Evolvable Representations for Long-Context Retrieval and Agentic Memory
☆52 · Updated this week
real-ljt / NAMEdataset
(ACM MM 2025) NaME: A Natural Micro-expression Dataset for Micro-expression Recognition in the Wild
☆18 · Mar 22, 2026 · Updated 4 months ago
VITA-MLLM / Freeze-Omni
✨✨ Freeze-Omni: A Smart and Low Latency Speech-to-speech Dialogue Model with Frozen LLM
☆388 · May 27, 2025 · Updated last year
MME-Benchmarks / Video-MME
✨✨ [CVPR 2025] Video-MME: The First-Ever Comprehensive Evaluation Benchmark of Multi-modal LLMs in Video Analysis
☆787 · Dec 8, 2025 · Updated 7 months ago
FrankYang-17 / MME-VideoOCR
☆40 · May 28, 2025 · Updated last year
CSLiJT / HCD-code
Official code of HierCDF @ SIGKDD2022
☆12 · Aug 14, 2022 · Updated 3 years ago
lemon0830 / promptCSE
Code for promptCSE, EMNLP 2022
☆11 · Apr 10, 2023 · Updated 3 years ago
ChoS3nE11ven / Agentic-MME
☆36 · Apr 13, 2026 · Updated 3 months ago
BradyFU / DVG-Face
[TPAMI 2021] DVG-Face: Dual Variational Generation for Heterogeneous Face Recognition
☆76 · Nov 13, 2023 · Updated 2 years ago
Shaokang-Agent / Awesome-Reinforcement-Learning-Papers
📚 List of Top-tier Conference Papers on Reinforcement Learning (RL), including: NeurIPS, AAAI, IJCAI, ICML, AAMAS, ICLR, ICRA, etc. | (AI…
☆11 · Aug 20, 2023 · Updated 2 years ago
baiyuliang / Qrobot_Vue
Vue version of the 小Q (Xiao Q) chatbot
☆14 · Jan 6, 2023 · Updated 3 years ago
MAC-AutoML / ITPruner
☆28 · Dec 7, 2021 · Updated 4 years ago
VITA-MLLM / VITA
✨✨ [NeurIPS 2025] VITA-1.5: Towards GPT-4o Level Real-Time Vision and Speech Interaction
☆2,521 · Mar 28, 2025 · Updated last year
VITA-MLLM / Woodpecker
✨✨ Woodpecker: Hallucination Correction for Multimodal Large Language Models
☆649 · Dec 23, 2024 · Updated last year
Minhchuyentoancbn / Data-Mining-and-Machine-Learning-Projects
Implementation-based Projects in Data Mining and Machine Learning
☆22 · Apr 3, 2023 · Updated 3 years ago
Shaokang-Agent / DCVTD
Implementation of the paper "Decentralized Counterfactual Value with Threat Detection for Multi-Agent Reinforcement Learning in Mixed Coo…
☆17 · Dec 7, 2024 · Updated last year
zhourax / VEGA
☆38 · Jul 9, 2024 · Updated 2 years ago
Shaokang-Agent / WToE
Implementation of the paper "WToE: Learning When to Explore in Multi-Agent Reinforcement Learning"
☆21 · Aug 17, 2024 · Updated last year
John-Ge / Awesome-Native-Multimodal-Models
☆35 · Apr 9, 2025 · Updated last year
Shaokang-Agent / D-F
Implementation of the paper "Egoism, Utilitarianism and Egalitarianism in Multi-Agent Reinforcement Learning"
☆21 · Aug 17, 2024 · Updated last year
Shaokang-Agent / S2L
Implementation of the paper "Multi-Agent Exploration via Self-Learning and Social Learning"
☆20 · Dec 7, 2024 · Updated last year
FrankYang-17 / RealUnify
☆27 · Oct 10, 2025 · Updated 9 months ago
ChenyuHeidiZhang / VL-commonsense
☆14 · May 23, 2022 · Updated 4 years ago
xjtupanda / AUW-GCN
[ICME-2023] Official PyTorch implementation of AU-aware graph convolutional network for Macro- and Micro-expression spotting
☆36 · Mar 21, 2025 · Updated last year
zehanwang01 / OmniBind
☆34 · Apr 11, 2025 · Updated last year
mikecheninoulu / Emotional-gesture-papers
☆23 · May 29, 2025 · Updated last year
MME-Benchmarks / MME-RealWorld
✨✨ [ICLR 2025] MME-RealWorld: Could Your Multimodal LLM Challenge High-Resolution Real-World Scenarios that are Difficult for Humans?
☆161 · Oct 21, 2025 · Updated 9 months ago
NROwind / OpenGPT-4o-Image
A Comprehensive Dataset for Advanced Image Generation and Editing
☆33 · Oct 2, 2025 · Updated 9 months ago