☆13 · Mar 28, 2025 · Updated last year
Alternatives and similar repositories for ST-VLM
Users that are interested in ST-VLM are comparing it to the libraries listed below.
- [NeurIPS25] Official Implementation (Pytorch) of "DeepVideo-R1" ☆33 · Feb 22, 2026 · Updated 2 months ago
- Open-Vocabulary Video Question Answering: A New Benchmark for Evaluating the Generalizability of Video Question Answering Models (ICCV 20… ☆18 · Apr 23, 2024 · Updated 2 years ago
- Official Implementation (Pytorch) of the "Generative Subgraph Retrieval for Knowledge Graph-Grounded Dialog Generation", EMNLP 2024 (main… ☆12 · Mar 10, 2025 · Updated last year
- Official Implementation (Pytorch) of the "VidChain: Chain-of-Tasks with Metric-based Direct Preference Optimization for Dense Video Capti… ☆25 · Jan 26, 2025 · Updated last year
- Video-Text Representation Learning via Differentiable Weak Temporal Alignment (CVPR 2022) ☆18 · Apr 19, 2024 · Updated 2 years ago
- MELTR: Meta Loss Transformer for Learning to Fine-tune Video Foundation Models (CVPR 2023) ☆35 · Apr 23, 2024 · Updated 2 years ago
- ☆22 · Feb 8, 2026 · Updated 2 months ago
- ☆10 · Apr 19, 2024 · Updated 2 years ago
- Official Implementation (Pytorch) of the "LLaMo: Large Language Model-based Molecular Graph Assistant", NeurIPS 2024 ☆37 · Feb 12, 2025 · Updated last year
- [ICCV'25] Ross3D: Reconstructive Visual Instruction Tuning with 3D-Awareness ☆69 · Jul 22, 2025 · Updated 9 months ago
- Official Implementation (Pytorch) of "DAVI: Diffusion Prior-Based Amortized Variational Inference for Noisy Inverse Problems", ECCV 2024 … ☆75 · Aug 16, 2024 · Updated last year
- LEO: A powerful Hybrid Multimodal LLM ☆20 · Jan 18, 2025 · Updated last year
- Official implementation of paper "OED: Towards One-stage End-to-End Dynamic Scene Graph Generation". ☆29 · Mar 26, 2024 · Updated 2 years ago
- Official PyTorch implementation of "Groupwise Query Specialization and Quality-Aware Multi-Assignment for Transformer-based Visual Relati… ☆41 · Apr 19, 2024 · Updated 2 years ago
- [ACL2023] Official code repository for VLN-Trans ☆14 · Sep 10, 2023 · Updated 2 years ago
- This is an official implementation of our work, Select and Distill: Selective Dual-Teacher Knowledge Transfer for Continual Learning on V… ☆17 · Sep 24, 2025 · Updated 7 months ago
- ☆18 · Apr 10, 2025 · Updated last year
- [ICML 2025 Spotlight] RAPID: Long-Context Inference with Retrieval-Augmented Speculative Decoding ☆22 · Mar 2, 2025 · Updated last year
- [ECCV24] VISA: Reasoning Video Object Segmentation via Large Language Model ☆21 · Jul 20, 2024 · Updated last year
- KV cache compression via sparse coding ☆17 · Oct 26, 2025 · Updated 6 months ago
- ☆22 · Jun 6, 2024 · Updated last year
- [CVPR 2024] GSNeRF: Generalizable Semantic Neural Radiance Fields with Enhanced 3D Scene Understanding ☆18 · Jun 10, 2024 · Updated last year
- ☆25 · Mar 30, 2025 · Updated last year
- A simple taxonomic tree format using indented plain text ☆14 · Jun 28, 2025 · Updated 10 months ago
- ☆26 · Apr 26, 2025 · Updated last year
- ☆13 · May 15, 2025 · Updated 11 months ago
- Large Language Models are Temporal and Causal Reasoners for Video Question Answering (EMNLP 2023) ☆77 · Mar 26, 2025 · Updated last year
- [CVPR 2025] LLaVA-ST: A Multimodal Large Language Model for Fine-Grained Spatial-Temporal Understanding ☆84 · Jul 4, 2025 · Updated 10 months ago
- ☆26 · Mar 26, 2025 · Updated last year
- ☆28 · Jul 23, 2025 · Updated 9 months ago
- Code and Data for "GenAI Arena: An Open Evaluation Platform for Generative Models" [NeurIPS 2024] ☆35 · Sep 8, 2024 · Updated last year
- ☆31 · Mar 5, 2025 · Updated last year
- [ICLR 2025] Dataset and Code for Paper "Learning to Generate Diverse Pedestrian Movements from Web Videos with Noisy Labels" ☆44 · Dec 23, 2025 · Updated 4 months ago
- [ICCV 2025] SparseMM: Head Sparsity Emerges from Visual Concept Responses in MLLMs ☆85 · Jan 17, 2026 · Updated 3 months ago
- VideoHallucer, The first comprehensive benchmark for hallucination detection in large video-language models (LVLMs) ☆43 · Dec 16, 2025 · Updated 4 months ago
- Code for our paper "Category Query Learning for Human-Object Interaction Classification" (CVPR2023) ☆37 · Jul 9, 2023 · Updated 2 years ago
- [CVPR 2026] Official release of "Spatial-SSRL: Enhancing Spatial Understanding via Self-Supervised Reinforcement Learning" ☆128 · Apr 7, 2026 · Updated 3 weeks ago
- ☆20 · Apr 17, 2025 · Updated last year
- IROS ☆18 · Aug 10, 2025 · Updated 8 months ago