xjtupanda/Sparrow

Repo for paper "T2Vid: Translating Long Text into Multi-Image is the Catalyst for Video-LLMs"

☆48

Alternatives and similar repositories for Sparrow

Users that are interested in Sparrow are comparing it to the libraries listed below.

zyzhangUstc / MELLM
☆40 · Mar 18, 2026 · Updated 4 months ago
VITA-MLLM / Sparrow
Sparrow: Data-Efficient Video-LLM with Text-to-Image Augmentation
☆32 · Mar 28, 2025 · Updated last year
zhourax / VEGA
☆38 · Jul 9, 2024 · Updated 2 years ago
MAC-AutoML / QuoTA
✨✨[AAAI 2026] This is the official implementation of our paper "QuoTA: Query-oriented Token Assignment via CoT Query Decouple for Long Vi…
☆79 · Apr 28, 2025 · Updated last year
CSLiJT / HCD-code
Official code of HierCDF @ SIGKDD2022
☆12 · Aug 14, 2022 · Updated 3 years ago
MME-Benchmarks / Video-MME
✨✨[CVPR 2025] Video-MME: The First-Ever Comprehensive Evaluation Benchmark of Multi-modal LLMs in Video Analysis
☆788 · Dec 8, 2025 · Updated 7 months ago
CeeZh / SILVR
Official Implementation for "SiLVR: A Simple Language-based Video Reasoning Framework"
☆19 · Jan 18, 2026 · Updated 6 months ago
SCZwangxiao / video-ReTaKe
Official implementation of the paper "ReTaKe: Reducing Temporal and Knowledge Redundancy for Long Video Understanding"
☆40 · Mar 16, 2025 · Updated last year
SUSTechBruce / LOOK-M
[EMNLP 2024 Findings🔥] Official implementation of "LOOK-M: Look-Once Optimization in KV Cache for Efficient Multimodal Long-Context In…
☆103 · Nov 9, 2024 · Updated last year
THUNLP-MT / Brote
☆11 · Jan 19, 2025 · Updated last year
Hon-Wong / ByteVideoLLM
[ICCV 2025] Dynamic-VLM
☆28 · Dec 16, 2024 · Updated last year
ggg0919 / cantor
☆90 · May 10, 2024 · Updated 2 years ago
Leon1207 / 3DRefTR
This is a PyTorch implementation of 3DRefTR proposed by our paper "A Unified Framework for 3D Point Cloud Visual Grounding"
☆26 · Aug 24, 2023 · Updated 2 years ago
inFaaa / Evolver
[COLING 2025🔥] Evolver: Chain-of-Evolution Prompting to Boost Large Multimodal Models for Hateful Meme Detection
☆17 · Jan 21, 2025 · Updated last year
Share14 / ShareGemini
☆32 · Jul 29, 2024 · Updated last year
adobe-research / llava-score
☆11 · Oct 2, 2024 · Updated last year
ziqipang / MR-Video
MR. Video: MapReduce is the Principle for Long Video Understanding
☆31 · Jun 18, 2026 · Updated last month
BradyFU / DVG-Face
[TPAMI 2021] DVG-Face: Dual Variational Generation for Heterogeneous Face Recognition
☆76 · Nov 13, 2023 · Updated 2 years ago
MME-Benchmarks / Video-MME-v2
Video-MME-v2: Towards the Next Stage in Benchmarks for Comprehensive Video Understanding
☆369 · May 24, 2026 · Updated 2 months ago
qiujihao19 / Artemis
[NeurIPS 2024] Artemis: Towards Referential Understanding in Complex Videos
☆27 · Apr 8, 2025 · Updated last year
haoyu-bu / CAFe
Code for "CAFe: Unifying Representation and Generation with Contrastive-Autoregressive Finetuning"
☆33 · Mar 26, 2025 · Updated last year
VITA-MLLM / Long-VITA
✨✨Long-VITA: Scaling Large Multi-modal Models to 1 Million Tokens with Leading Short-Context Accuracy
☆305 · May 14, 2025 · Updated last year
FreedomIntelligence / LongLLaVA
LongLLaVA: Scaling Multi-modal LLMs to 1000 Images Efficiently via Hybrid Architecture
☆211 · Jan 6, 2025 · Updated last year
Tencent / LUCY
The official implementation of "LUCY: Linguistic Understanding and Control Yielding Early Stage of Her"
☆16 · Jul 10, 2025 · Updated last year
CASIA-IVA-Lab / VideoNIAH
VideoNIAH: A Flexible Synthetic Method for Benchmarking Video MLLMs
☆57 · Mar 9, 2025 · Updated last year
shenyunhang / APE
[CVPR 2024] Aligning and Prompting Everything All at Once for Universal Visual Perception
☆608 · May 8, 2024 · Updated 2 years ago
TimeMarker-LLM / TimeMarker
A Versatile Video-LLM for Long and Short Video Understanding with Superior Temporal Localization Ability
☆107 · Nov 28, 2024 · Updated last year
DCDmllm / Momentor
☆81 · Nov 24, 2024 · Updated last year
RUCAIBox / Event-Bench
Official code of "Towards Event-oriented Long Video Understanding"
☆12 · Jul 26, 2024 · Updated 2 years ago
Kwai-YuanQi / MM-RLHF
The Next Step Forward in Multimodal LLM Alignment
☆198 · May 1, 2025 · Updated last year
GitHubOfHyl97 / SkeAttnCLR
The Official PyTorch implementation of "Part Aware Contrastive Learning for Self-Supervised Action Recognition" in IJCAI 2023
☆13 · Nov 9, 2023 · Updated 2 years ago
Tencent / Freeze-Omni
The official implementation of Freeze-Omni.
☆16 · Jul 10, 2025 · Updated last year
PKU-YuanGroup / SwapAnyone
An official implementation of SwapAnyone.
☆77 · Mar 14, 2025 · Updated last year
ruili33 / TPO
☆41 · Sep 9, 2025 · Updated 10 months ago
farewellthree / BT-Adapter
[CVPR 2024] Official PyTorch implementation of the paper "One For All: Video Conversation is Feasible Without Video Instruction Tuning"
☆35 · Feb 2, 2024 · Updated 2 years ago
alanaai / EVUD
Egocentric Video Understanding Dataset (EVUD)
☆34 · Jul 4, 2024 · Updated 2 years ago
alibaba / conv-llava
☆128 · Jul 29, 2024 · Updated last year
duyichao / MINETrans-IWSLT23
Official implementation of our IWSLT 2023 paper "The MineTrans Systems for IWSLT 2023 Offline Speech Translation and Speech-to-Speech Tra…
☆16 · Jul 14, 2023 · Updated 3 years ago
sosppxo / 3D-STMN
[AAAI 2024] The official implementation of the paper "3D-STMN: Dependency-Driven Superpoint-Text Matching Network for End-to-End 3D Refer…
☆45 · Dec 20, 2023 · Updated 2 years ago