Hon-Wong/Elysium

Readme badge preview -

If you own this repo, copy the snippet below and add it to your README.md

[![RelatedRepos](https://img.shields.io/badge/related-repos-yellow)](https://relatedrepos.com/gh/Hon-Wong/Elysium)

Hon-Wong / Elysium

[ECCV 2024] Elysium: Exploring Object-level Perception in Videos via MLLM

☆89

Alternatives and similar repositories for Elysium

Users that are interested in Elysium are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.

Sorting:

TRI-ML / VOST
View on GitHub
Code for the VOST dataset
☆26Oct 1, 2023Updated 2 years ago
mbzuai-oryx / TrackingMeetsLMM
View on GitHub
☆10Apr 7, 2025Updated last year
HengLan / VastTrack
View on GitHub
[NeurIPS 2024] VastTrack: Vast Category Visual Object Tracking
☆77Sep 30, 2025Updated 9 months ago
Vision-CAIR / LongVU
View on GitHub
[ICML 2025] Official PyTorch implementation of LongVU
☆431May 8, 2025Updated last year
whwu95 / FreeVA
View on GitHub
FreeVA: Offline MLLM as Training-Free Video Assistant
☆69Jun 9, 2024Updated 2 years ago
Deploy on Railway without the complexity - Free Credits Offer • Ad
Connect your repo and Railway handles the rest with instant previews. Quickly provision container image services, databases, and storage volumes.
Nathan-Li123 / LaMOT
View on GitHub
[ICRA 2025] LaMOT: Language-Guided Multi-Object Tracking
☆30Feb 10, 2025Updated last year
lscpku / VITATECS
View on GitHub
☆18Jul 10, 2024Updated 2 years ago
RobertLuo1 / NeurIPS2023_SOC
View on GitHub
[NeurIPS 2023] The official implementation of SOC: Semantic-Assisted Object Cluster for Referring Video Object Segmentation
☆33Mar 16, 2024Updated 2 years ago
Coo1Sea / OVT-B-Dataset
View on GitHub
[NeurIPS 2024] Repository for the paper "OVT-B: A New Large-Scale Benchmark for Open-Vocabulary Multi-Object Tracking".
☆28Nov 9, 2024Updated last year
TimeMarker-LLM / TimeMarker
View on GitHub
A Versatile Video-LLM for Long and Short Video Understanding with Superior Temporal Localization Ability
☆107Nov 28, 2024Updated last year
Hon-Wong / ByteVideoLLM
View on GitHub
[ICCV 2025] Dynamic-VLM
☆28Dec 16, 2024Updated last year
mrwu-mac / R-Bench
View on GitHub
[ICML2024] Repo for the paper `Evaluating and Analyzing Relationship Hallucinations in Large Vision-Language Models'
☆24Jan 1, 2025Updated last year
OpenGVLab / VideoChat-Flash
View on GitHub
[ICLR2026] VideoChat-Flash: Hierarchical Compression for Long-Context Video Modeling
☆526Jul 19, 2026Updated last week
jinxiang-liu / UFE-AVS
View on GitHub
Official code for CVPR 2024 paper, "Audio-Visual Segmentation via Unlabeled Frame Exploitation""
☆19Jul 7, 2024Updated 2 years ago
Managed hosting for WordPress and PHP on Cloudways • Ad
Managed hosting for WordPress, Magento, Laravel, or PHP apps, on multiple cloud providers. Deploy in minutes on Cloudways by DigitalOcean.
alanaai / EVUD
View on GitHub
Egocentric Video Understanding Dataset (EVUD)
☆34Jul 4, 2024Updated 2 years ago
jefferyZhan / Griffon
View on GitHub
Official repo of Griffon series including v1(ECCV 2024), v2(ICCV 2025), G, and R, and also the RL tool Vision-R1(CVPR 2026).
☆250Apr 17, 2026Updated 3 months ago
yunlong10 / CAT-V
View on GitHub
[AAAI 26 Demo] Offical repo for CAT-V - Caption Anything in Video: Object-centric Dense Video Captioning with Spatiotemporal Multimodal P…
☆68Jan 27, 2026Updated 6 months ago
jinxiang-liu / anno-free-AVS
View on GitHub
Official code for WACV 2024 paper, "Annotation-free Audio-Visual Segmentation"
☆38Oct 11, 2024Updated last year
yuecao0119 / MMInstruct
View on GitHub
[SCIS 2024] The official implementation of the paper "MMInstruct: A High-Quality Multi-Modal Instruction Tuning Dataset with Extensive Di…
☆64Nov 7, 2024Updated last year
sukjunhwang / set_classifier
View on GitHub
Official PyTorch implementation of: "Cannot See the Forest for the Trees: Aggregating Multiple Viewpoints to Better Classify Objects in V…
☆14Aug 29, 2022Updated 3 years ago
twelvelabs-io / pegasus-1-eval
View on GitHub
Repository for evaluating Pegasus-1 and video-language foundation models
☆14Nov 12, 2024Updated last year
VectorSpaceLab / Video-XL
View on GitHub
🔥🔥First-ever hour scale video understanding models
☆626Jul 14, 2025Updated last year
yellow-binary-tree / HawkEye
View on GitHub
Official implementation of HawkEye: Training Video-Text LLMs for Grounding Text in Videos
☆47Apr 29, 2024Updated 2 years ago
Managed Database hosting by DigitalOcean • Ad
PostgreSQL, MySQL, MongoDB, Kafka, Valkey, and OpenSearch available. Automatically scale up storage and focus on building your apps.
xuefeng-zhu5 / UniMod1K
View on GitHub
The dataset and codes of the paper UniMod1K: Towards a More Universal Large-Scale Dataset and Benchmark for Multi-Modal Learning.
☆17Sep 21, 2025Updated 10 months ago
Espere-1119-Song / Video-MMLU
View on GitHub
A Massive Multi-Discipline Lecture Understanding Benchmark
☆34Apr 20, 2026Updated 3 months ago
bytedance / tarsier
View on GitHub
Tarsier -- a family of large-scale video-language models, which is designed to generate high-quality video descriptions , together with g…
☆548Aug 14, 2025Updated 11 months ago
Tencent-QQMM / Video-CCAM
View on GitHub
A lightweight flexible Video-MLLM developed by TencentQQ Multimedia Research Team.
☆74Oct 14, 2024Updated last year
CASIA-IVA-Lab / VideoNIAH
View on GitHub
VideoNIAH: A Flexible Synthetic Method for Benchmarking Video MLLMs
☆57Mar 9, 2025Updated last year
zhouyiks / CoLVA
View on GitHub
☆44Jul 9, 2025Updated last year
Event-AHU / Open_VLTrack
View on GitHub
Vision-Language based Visual Object Tracking
☆36Jul 4, 2026Updated 3 weeks ago
longvideobench / LongVideoBench
View on GitHub
[Neurips 24' D&B] Official Dataloader and Evaluation Scripts for LongVideoBench.
☆134Jul 27, 2024Updated 2 years ago
DwanZhang-AI / SePPO
View on GitHub
Code for "SePPO: Semi-Policy Preference Optimization for Diffusion Alignment."
☆18Oct 7, 2024Updated last year
1-Click AI Models by DigitalOcean Gradient • Ad
Deploy popular AI models on DigitalOcean Gradient GPU virtual machines with just a single click. Zero configuration with optimized deployments.
bigai-nlco / VideoLLaMB
View on GitHub
[ICCV 2025] Official Repository of VideoLLaMB: Long Video Understanding with Recurrent Memory Bridges
☆87Feb 27, 2025Updated last year
RenShuhuai-Andy / NBP
View on GitHub
Official implementation of Next Block Prediction: Video Generation via Semi-Autoregressive Modeling
☆42Feb 12, 2025Updated last year
llyx97 / TempCompass
View on GitHub
[ACL 2024 Findings] "TempCompass: Do Video LLMs Really Understand Videos?", Yuanxin Liu, Shicheng Li, Yi Liu, Yuxiang Wang, Shuhuai Ren, …
☆133Apr 4, 2025Updated last year
janghyuncho / DECOLA
View on GitHub
Code release for "Language-conditioned Detection Transformer"
☆86Jun 17, 2024Updated 2 years ago
yale-nlp / TOMATO
View on GitHub
☆42Nov 8, 2024Updated last year
WHB139426 / Grounded-Video-LLM
View on GitHub
[EMNLP 2025 Findings] Grounded-VideoLLM: Sharpening Fine-grained Temporal Grounding in Video Large Language Models
☆149Aug 21, 2025Updated 11 months ago
YangLiu14 / Open-World-Tracking
View on GitHub
Official code for "Opening up Open World Tracking" (CVPR 2022)
☆56Apr 8, 2023Updated 3 years ago