☆13Mar 28, 2025Updated 11 months ago
Alternatives and similar repositories for ST-VLM
Users that are interested in ST-VLM are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.
Sorting:
- [NeurIPS25] Official Implementation (Pytorch) of "DeepVideo-R1"☆31Feb 22, 2026Updated last month
- Open-Vocabulary Video Question Answering: A New Benchmark for Evaluating the Generalizability of Video Question Answering Models (ICCV 20…☆18Apr 23, 2024Updated last year
- Official Implementation (Pytorch) of the "Generative Subgraph Retrieval for Knowledge Graph-Grounded Dialog Generation", EMNLP 2024 (main…☆12Mar 10, 2025Updated last year
- Official Implementation (Pytorch) of the "VidChain: Chain-of-Tasks with Metric-based Direct Preference Optimization for Dense Video Capti…☆24Jan 26, 2025Updated last year
- Video-Text Representation Learning via Differentiable Weak Temporal Alignment (CVPR 2022)☆17Apr 19, 2024Updated last year
- MELTR: Meta Loss Transformer for Learning to Fine-tune Video Foundation Models (CVPR 2023)☆35Apr 23, 2024Updated last year
- ☆23Feb 8, 2026Updated last month
- ☆10Apr 19, 2024Updated last year
- Official Implementation (Pytorch) of the "LLaMo: Large Language Model-based Molecular Graph Assistant", NeurIPS 2024☆36Feb 12, 2025Updated last year
- [ICCV'25] Ross3D: Reconstructive Visual Instruction Tuning with 3D-Awareness☆67Jul 22, 2025Updated 8 months ago
- Official Implementation (Pytorch) of "DAVI: Diffusion Prior-Based Amortized Variational Inference for Noisy Inverse Problems", ECCV 2024 …☆74Aug 16, 2024Updated last year
- LEO: A powerful Hybrid Multimodal LLM☆20Jan 18, 2025Updated last year
- Official implementation of paper "OED: Towards One-stage End-to-End Dynamic Scene Graph Generation".☆26Mar 26, 2024Updated last year
- Official PyTorch implementation of "Groupwise Query Specialization and Quality-Aware Multi-Assignment for Transformer-based Visual Relati…☆41Apr 19, 2024Updated last year
- [ACL2023] Official code repository for VLN-Trans☆14Sep 10, 2023Updated 2 years ago
- This is an official implementation of our work, Select and Distill: Selective Dual-Teacher Knowledge Transfer for Continual Learning on V…☆16Sep 24, 2025Updated 6 months ago
- ☆18Apr 10, 2025Updated 11 months ago
- [ICML 2025 Spotlight] RAPID: Long-Context Inference with Retrieval-Augmented Speculative Decoding☆19Mar 2, 2025Updated last year
- [ECCV24] VISA: Reasoning Video Object Segmentation via Large Language Model☆19Jul 20, 2024Updated last year
- KV cache compression via sparse coding☆17Oct 26, 2025Updated 4 months ago
- ☆25Mar 30, 2025Updated 11 months ago
- ☆22Jun 6, 2024Updated last year
- [CVPR 2024] GSNeRF: Generalizable Semantic Neural Radiance Fields with Enhanced 3D Scene Understanding☆18Jun 10, 2024Updated last year
- A simple taxonomic tree format using indented plain text☆13Jun 28, 2025Updated 8 months ago
- ☆26Apr 26, 2025Updated 10 months ago
- ☆13May 15, 2025Updated 10 months ago
- Large Language Models are Temporal and Causal Reasoners for Video Question Answering (EMNLP 2023)☆78Mar 26, 2025Updated 11 months ago
- ☆26Mar 26, 2025Updated 11 months ago
- [CVPR 2025] LLaVA-ST: A Multimodal Large Language Model for Fine-Grained Spatial-Temporal Understanding☆83Jul 4, 2025Updated 8 months ago
- ☆27Jul 23, 2025Updated 8 months ago
- Code and Data for "GenAI Arena: An Open Evaluation Platform for Generative Models" [NeurIPS 2024]☆35Sep 8, 2024Updated last year
- ☆31Mar 5, 2025Updated last year
- [ICLR 2025] Dataset and Code for Paper "Learning to Generate Diverse Pedestrian Movements from Web Videos with Noisy Labels"☆45Dec 23, 2025Updated 3 months ago
- [ICCV 2025] SparseMM: Head Sparsity Emerges from Visual Concept Responses in MLLMs☆82Jan 17, 2026Updated 2 months ago
- [CVPR 2026] Official release of "Spatial-SSRL: Enhancing Spatial Understanding via Self-Supervised Reinforcement Learning"☆121Feb 25, 2026Updated 3 weeks ago
- VideoHallucer, The first comprehensive benchmark for hallucination detection in large video-language models (LVLMs)☆42Dec 16, 2025Updated 3 months ago
- Code for our paper "Category Query Learning for Human-Object Interaction Classification" (CVPR2023)☆37Jul 9, 2023Updated 2 years ago
- ☆20Apr 17, 2025Updated 11 months ago
- ☆49Sep 13, 2024Updated last year