gyxxyg / TRACE
[ICLR 2025] TRACE: Temporal Grounding Video LLM via Casual Event Modeling
☆63Updated last week
Alternatives and similar repositories for TRACE:
Users that are interested in TRACE are comparing it to the libraries listed below
- Official implementation of HawkEye: Training Video-Text LLMs for Grounding Text in Videos☆36Updated 9 months ago
- Official PyTorch code of "Grounded Question-Answering in Long Egocentric Videos", accepted by CVPR 2024.☆56Updated 4 months ago
- VideoEspresso: A Large-Scale Chain-of-Thought Dataset for Fine-Grained Video Reasoning via Core Frame Selection☆46Updated 2 weeks ago
- ☆63Updated 2 months ago
- [AAAI 2025] VTG-LLM: Integrating Timestamp Knowledge into Video LLMs for Enhanced Video Temporal Grounding☆86Updated last month
- Can I Trust Your Answer? Visually Grounded Video Question Answering (CVPR'24, Highlight)☆63Updated 6 months ago
- FreeVA: Offline MLLM as Training-Free Video Assistant☆54Updated 7 months ago
- Official implementation of paper ReTaKe: Reducing Temporal and Knowledge Redundancy for Long Video Understanding☆19Updated 3 weeks ago
- [CVPR 2024] Context-Guided Spatio-Temporal Video Grounding☆45Updated 7 months ago
- Codes for ICML 2024 paper: "Video-of-Thought: Step-by-Step Video Reasoning from Perception to Cognition"☆84Updated last month
- [CVPR 2024] Official PyTorch implementation of the paper "One For All: Video Conversation is Feasible Without Video Instruction Tuning"☆29Updated 11 months ago
- [CVPR 2024] Retrieval-Augmented Image Captioning with External Visual-Name Memory for Open-World Comprehension☆44Updated 9 months ago
- The official repository for paper "PruneVid: Visual Token Pruning for Efficient Video Large Language Models".☆28Updated 2 weeks ago
- The official code of Towards Balanced Alignment: Modal-Enhanced Semantic Modeling for Video Moment Retrieval (AAAI2024)☆29Updated 10 months ago
- Grounded-VideoLLM: Sharpening Fine-grained Temporal Grounding in Video Large Language Models☆75Updated last month
- ☆63Updated 2 months ago
- The official implementation of RAR☆79Updated 10 months ago
- A Versatile Video-LLM for Long and Short Video Understanding with Superior Temporal Localization Ability☆78Updated 2 months ago
- VideoNIAH: A Flexible Synthetic Method for Benchmarking Video MLLMs☆33Updated 3 months ago
- official impelmentation of Kangaroo: A Powerful Video-Language Model Supporting Long-context Video Input☆62Updated 5 months ago
- ☆37Updated 9 months ago
- 【NeurIPS 2024】Dense Connector for MLLMs☆154Updated 3 months ago
- This repo holds the official code and data for "Beyond Literal Descriptions: Understanding and Locating Open-World Objects Aligned with H…☆17Updated 8 months ago
- [CVPR 2024] Do you remember? Dense Video Captioning with Cross-Modal Memory Retrieval☆48Updated 7 months ago
- Pytorch Code for "Unified Coarse-to-Fine Alignment for Video-Text Retrieval" (ICCV 2023)☆62Updated 7 months ago
- [Preprint] Number it: Temporal Grounding Videos like Flipping Manga☆54Updated 2 months ago
- MMICL, a state-of-the-art VLM with the in context learning ability from ICL, PKU☆44Updated last year
- [Neurips 24' D&B] Official Dataloader and Evaluation Scripts for LongVideoBench.☆84Updated 6 months ago
- [ACL 2024 Findings] "TempCompass: Do Video LLMs Really Understand Videos?", Yuanxin Liu, Shicheng Li, Yi Liu, Yuxiang Wang, Shuhuai Ren, …☆97Updated 2 months ago
- Repo for paper "T2Vid: Translating Long Text into Multi-Image is the Catalyst for Video-LLMs"☆48Updated this week