LongVideoHaystack / TStar

☆53

Alternatives and similar repositories for TStar

Users who are interested in TStar are comparing it to the repositories listed below.

appletea233 / Temporal-R1
Reinforcement Learning Tuning for VideoLLMs: Reward Design and Data Efficiency
☆40 Updated 2 weeks ago
EvolvingLMMs-Lab / VideoMMMU
☆49 Updated 2 months ago
yaolinli / TimeChat-Online
TimeChat-Online: 80% Visual Tokens are Naturally Redundant in Streaming Videos
☆50 Updated this week
JoeLeelyf / OVO-Bench
[CVPR 2025] OVO-Bench: How Far is Your Video-LLMs from Real-World Online Video Understanding?
☆65 Updated 2 months ago
hmxiong / StreamChat
[ICLR 2025] Official repo for "Streaming Video Understanding and Multi-round Interaction with Memory-enhanced Knowledge"
☆55 Updated 3 months ago
hshjerry / VideoEspresso
[CVPR 2025 Oral] VideoEspresso: A Large-Scale Chain-of-Thought Dataset for Fine-Grained Video Reasoning via Core Frame Selection
☆87 Updated 2 weeks ago
Becomebright / ReKV
Official PyTorch Code of ReKV (ICLR'25)
☆28 Updated 3 months ago
yu-rp / VisualPerceptionToken
☆86 Updated 3 months ago
TencentARC / Video-Holmes
Video-Holmes: Can MLLM Think Like Holmes for Complex Video Reasoning?
☆51 Updated 3 weeks ago
TencentARC / SEED-Bench-R1
☆84 Updated 2 months ago
egolife-ai / Ego-R1
Ego-R1: Chain-of-Tool-Thought for Ultra-Long Egocentric Video Reasoning
☆70 Updated this week
TIGER-AI-Lab / Pixel-Reasoner
Pixel-Level Reasoning Model trained with RL
☆140 Updated last week
llyx97 / TempCompass
[ACL 2024 Findings] "TempCompass: Do Video LLMs Really Understand Videos?", Yuanxin Liu, Shicheng Li, Yi Liu, Yuxiang Wang, Shuhuai Ren, …
☆117 Updated 2 months ago
z-x-yang / DoraemonGPT
Official repository of DoraemonGPT: Toward Understanding Dynamic Scenes with Large Language Models
☆85 Updated 9 months ago
yonseivnl / vlm-rlaif
ACL'24 (Oral) Tuning Large Multimodal Models for Videos using Reinforcement Learning from AI Feedback
☆64 Updated 9 months ago
zhengrongz / AoTD
[CVPR 2025] Official PyTorch code of "Enhancing Video-LLM Reasoning via Agent-of-Thoughts Distillation".
☆33 Updated last month
OpenGVLab / VideoChat-R1
VideoChat-R1: Enhancing Spatio-Temporal Perception via Reinforcement Fine-Tuning
☆152 Updated 2 weeks ago
rese1f / aurora
[ICLR 2025] AuroraCap: Efficient, Performant Video Detailed Captioning and a New Benchmark
☆111 Updated 3 weeks ago
Ziyang412 / VideoTree
Code for CVPR25 paper "VideoTree: Adaptive Tree-based Video Representation for LLM Reasoning on Long Videos"
☆119 Updated 3 months ago
shiyi-zh0408 / NAE_CVPR2024
Accepted by CVPR 2024
☆33 Updated last year
WHB139426 / Grounded-Video-LLM
Grounded-VideoLLM: Sharpening Fine-grained Temporal Grounding in Video Large Language Models
☆116 Updated 3 months ago
Liuziyu77 / MIA-DPO
Official implementation of MIA-DPO
☆58 Updated 5 months ago
bigai-nlco / VideoLLaMB
Official Repository of VideoLLaMB: Long Video Understanding with Recurrent Memory Bridges
☆69 Updated 3 months ago
longvideobench / LongVideoBench
[NeurIPS'24 D&B] Official Dataloader and Evaluation Scripts for LongVideoBench.
☆97 Updated 10 months ago
GuangyanS / Sys2-LLaVA
☆24 Updated 4 months ago
Yxxxb / VoCo-LLaMA
[CVPR'2025] VoCo-LLaMA: This repo is the official implementation of "VoCo-LLaMA: Towards Vision Compression with Large Language Models".
☆169 Updated last week
TencentARC / TokLIP
TokLIP: Marry Visual Tokens to CLIP for Multimodal Comprehension and Generation
☆71 Updated 2 weeks ago
Visual-AI / PruneVid
The official repository for ACL2025 paper "PruneVid: Visual Token Pruning for Efficient Video Large Language Models".
☆46 Updated last month
ZhangXJ199 / TinyLLaVA-Video-R1
TinyLLaVA-Video-R1: Towards Smaller LMMs for Video Reasoning
☆74 Updated last month
SCZwangxiao / video-ReTaKe
Official implementation of the paper "ReTaKe: Reducing Temporal and Knowledge Redundancy for Long Video Understanding"
☆34 Updated 3 months ago