Lzq5/UniTime

Universal Video Temporal Grounding with Generative Multi-modal Large Language Models

☆56

Alternatives and similar repositories for UniTime

Users who are interested in UniTime are comparing it to the repositories listed below.

zhengrongz / AoTD
[CVPR 2025] Official PyTorch code of "Enhancing Video-LLM Reasoning via Agent-of-Thoughts Distillation".
☆58 · Updated this week
qirui-chen / RGA3-release
[ICCV 2025] Object-centric Video Question Answering with Visual Grounding and Referring
☆24 · Aug 8, 2025 · Updated 11 months ago
iLearn-Lab / TPAMI26-Awesome-MLLMs-for-Video-Temporal-Grounding
Latest Papers, Codes and Datasets on VTG-LLMs.
☆95 · Jul 12, 2026 · Updated last week
Tanveer81 / ReVisionLLM
This is the official implementation of ReVisionLLM: Recursive Vision-Language Model for Temporal Grounding in Hour-Long Videos
☆47 · Nov 5, 2025 · Updated 8 months ago
zjuruizhechen / TVG-R1
[EMNLP 2025 Industry] Datasets and Recipes for Video Temporal Grounding via Reinforcement Learning
☆36 · Oct 22, 2025 · Updated 9 months ago
qirui-chen / MultiHop-EgoQA
[AAAI 2025] Grounded Multi-Hop VideoQA in Long-Form Egocentric Videos
☆38 · May 27, 2025 · Updated last year
Lzq5 / Video-Text-Alignment
☆28 · Jul 18, 2025 · Updated last year
Go2Heart / StreamFormer
[ICCV 2025 Oral] Official implementation of Learning Streaming Video Representation via Multitask Training.
☆93 · Updated this week
Becomebright / GroundVQA
Official PyTorch code of GroundVQA (CVPR'24)
☆63 · Sep 13, 2024 · Updated last year
haoningwu3639 / MRGen
[ICCV 2025] MRGen: Segmentation Data Engine for Underrepresented MRI Modalities
☆41 · Sep 26, 2025 · Updated 9 months ago
minghangz / OnVTG
Online video temporal grounding
☆16 · Oct 20, 2025 · Updated 9 months ago
haoningwu3639 / SimpleSDM-Video
A simple and flexible PyTorch implementation of Video StableDiffusion (ZeroScope_v2) based on diffusers.
☆20 · Feb 15, 2024 · Updated 2 years ago
jbistanbul / universalvtg
Official Code for the paper "UniversalVTG: A Universal and Lightweight Foundation Model for Video Temporal Grounding"
☆15 · Apr 15, 2026 · Updated 3 months ago
TencentARC / TimeLens
[CVPR 2026] TimeLens: Rethinking Video Temporal Grounding with Multimodal LLMs
☆162 · Updated this week
jyrao / MatchTime
[EMNLP 2024 Oral] MatchTime: Towards Automatic Soccer Game Commentary Generation
☆104 · Jan 2, 2025 · Updated last year
zhang9302002 / ThinkingWithVideos
The official code of "Thinking With Videos: Multimodal Tool-Augmented Reinforcement Learning for Long Video Reasoning"
☆102 · Oct 15, 2025 · Updated 9 months ago
Tangkfan / Awesome-Temporal-Video-Grounding
paper list on Video Moment Retrieval (VMR), or Temporal Video Grounding (TVG), Video Grounding (VG), or Temporal Sentence Grounding in Vi…
☆43 · Dec 27, 2025 · Updated 6 months ago
haolinyang-hlyang / SoccerMaster
[CVPR 2026 Oral] SoccerMaster: A Vision Foundation Model for Soccer Understanding
☆67 · Jul 14, 2026 · Updated last week
ZijiaLewisLu / CVPR2025-DeCafNet
Official Repo for CVPR 2025 Paper -- DeCafNet: Delegate and Conquer for Efficient Temporal Grounding in Long Videos
☆17 · Mar 16, 2026 · Updated 4 months ago
Code-kunkun / ZS-CIR
[BMVC 2023] Zero-shot Composed Text-Image Retrieval
☆55 · Nov 26, 2024 · Updated last year
xiaomi-research / time-r1
[NeurIPS'25] Time-R1: Post-Training Large Vision Language Model for Temporal Video Grounding
☆95 · Dec 14, 2025 · Updated 7 months ago
Tanveer81 / RGNet
This is the official implementation of RGNet: A Unified Retrieval and Grounding Network for Long Videos
☆20 · Mar 3, 2025 · Updated last year
gyxxyg / TRACE
[ICLR 2025] TRACE: Temporal Grounding Video LLM via Causal Event Modeling
☆156 · Aug 22, 2025 · Updated 11 months ago
haoningwu3639 / SpatialScore
[CVPR 2026 Highlight] SpatialScore: Towards Comprehensive Evaluation for Spatial Intelligence
☆84 · May 28, 2026 · Updated last month
appletea233 / LLaVA-ST
[CVPR 2025] LLaVA-ST: A Multimodal Large Language Model for Fine-Grained Spatial-Temporal Understanding
☆84 · Jul 4, 2025 · Updated last year
minghangz / TFVTG
☆57 · Sep 13, 2024 · Updated last year
haoningwu3639 / SimpleSDM-3
A simple and flexible PyTorch implementation of StableDiffusion-3 based on diffusers for DIY and finetuning.
☆27 · May 28, 2025 · Updated last year
OpenGVLab / VideoChat-R1
[NIPS2025] VideoChat-R1 & R1.5: Enhancing Spatio-Temporal Perception and Reasoning via Reinforcement Fine-Tuning
☆268 · Oct 18, 2025 · Updated 9 months ago
yongliang-wu / NumPro
[CVPR2025] Number it: Temporal Grounding Videos like Flipping Manga
☆150 · Jan 19, 2026 · Updated 6 months ago
Becomebright / ReKV
[ICLR'25] Streaming Video Question-Answering with In-context Video KV-Cache Retrieval
☆122 · Nov 4, 2025 · Updated 8 months ago
MAGIC-AI4Med / RaTEScore
[EMNLP 2024] RaTEScore: A Metric for Radiology Report Generation
☆67 · May 18, 2025 · Updated last year
THUNLP-MT / MUSEG
Repo for paper "MUSEG: Reinforcing Video Temporal Understanding via Timestamp-Aware Multi-Segment Grounding".
☆40 · Jun 9, 2025 · Updated last year
EvolvingLMMs-Lab / ParaVT
ParaVT: Taming the Tool Prior Paradox for Parallel Tool Use in Agentic Video Reinforcement Learning
☆54 · Jun 2, 2026 · Updated last month
nusnlp / d2vlm
[ICCV 2025] Factorized Learning for Temporally Grounded Video-Language Models
☆24 · Apr 18, 2026 · Updated 3 months ago
jinxiang-liu / UFE-AVS
Official code for CVPR 2024 paper, "Audio-Visual Segmentation via Unlabeled Frame Exploitation"
☆19 · Jul 7, 2024 · Updated 2 years ago
JPShi12 / VideoLoom
[ICML 2026] VideoLoom: A Video Large Language Model for Joint Spatial-Temporal Understanding
☆27 · Jul 3, 2026 · Updated 3 weeks ago
Code-kunkun / LamRA
[CVPR 2025] LamRA: Large Multimodal Model as Your Advanced Retrieval Assistant
☆182 · Jul 7, 2025 · Updated last year
ninatu / howtocaption
Official implementation of "HowToCaption: Prompting LLMs to Transform Video Annotations at Scale." ECCV 2024
☆59 · Aug 19, 2025 · Updated 11 months ago
Becomebright / MTV
Revisiting Multi-Task Visual Representation Learning
☆22 · Jan 21, 2026 · Updated 6 months ago