Code implementation for the paper "Large-scale Pre-training for Grounded Video Caption Generation" (ICCV 2025)
☆29Jan 18, 2026Updated 2 months ago
Alternatives and similar repositories for grove
Users that are interested in grove are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.
Sorting:
- Support library for the MaskRCNN masks extracted on EPIC-KITCHENS-100☆14Dec 1, 2020Updated 5 years ago
- Implementation of "With a Little Help from my Temporal Context: Multimodal Egocentric Action Recognition, BMVC, 2021" in PyTorch☆20Dec 16, 2021Updated 4 years ago
- Official implementation of "HowToCaption: Prompting LLMs to Transform Video Annotations at Scale." ECCV 2024☆58Aug 19, 2025Updated 7 months ago
- Fast Diffusion-Based Counterfactuals for Shortcut Removal and Generation (ECCV 2024 ORAL)☆16Sep 3, 2024Updated last year
- A Spatial-Temporal Recurrent Neural Network for Video Saliency Prediction (TIP2021)☆13Jul 7, 2022Updated 3 years ago
- NordVPN Special Discount Offer • AdSave on top-rated NordVPN 1 or 2-year plans with secure browsing, privacy protection, and support for for all major platforms.
- Whisper finetuning☆16Apr 9, 2025Updated 11 months ago
- Official Implementation (Pytorch) of the "VidChain: Chain-of-Tasks with Metric-based Direct Preference Optimization for Dense Video Capti…☆24Jan 26, 2025Updated last year
- Codebase for the paper: "TIM: A Time Interval Machine for Audio-Visual Action Recognition"☆54Nov 7, 2024Updated last year
- [CVPR 2023] OCTET: Object-aware Counterfactual Explanations☆19Dec 12, 2024Updated last year
- ☆44Jul 28, 2025Updated 7 months ago
- [CVPR2025] Official code for Lost in Translation Found in Context☆23Jan 14, 2026Updated 2 months ago
- ☆11Nov 17, 2018Updated 7 years ago
- ☆10Oct 24, 2024Updated last year
- [AAAI 26 Demo] Offical repo for CAT-V - Caption Anything in Video: Object-centric Dense Video Captioning with Spatiotemporal Multimodal P…☆65Jan 27, 2026Updated 2 months ago
- Virtual machines for every use case on DigitalOcean • AdGet dependable uptime with 99.99% SLA, simple security tools, and predictable monthly pricing with DigitalOcean's virtual machines, called Droplets.
- Official implementation of HawkEye: Training Video-Text LLMs for Grounding Text in Videos☆46Apr 29, 2024Updated last year
- Official code for the paper "Understanding Co-speech Gestures in-the-wild"☆20Oct 31, 2025Updated 4 months ago
- ☆18Mar 8, 2023Updated 3 years ago
- (CVPR 2026) Long-RVOS: A Comprehensive Benchmark for Long-term Referring Video Object Segmentation☆30Feb 28, 2026Updated 3 weeks ago
- ☆67Sep 13, 2022Updated 3 years ago
- Code release for the paper "Progress-Aware Video Frame Captioning" (CVPR 2025)☆21Jul 16, 2025Updated 8 months ago
- Unsupervised Multi-object Segmentation by Predicting Probable Motion Patterns☆17Nov 15, 2022Updated 3 years ago
- Combined InstantID🔥 and FouriScale to generate high resolution image!☆11Apr 3, 2024Updated last year
- (ICLR 2026)Official repository of 'ScaleCap: Inference-Time Scalable Image Captioning via Dual-Modality Debiasing’☆59Jan 26, 2026Updated 2 months ago
- Managed hosting for WordPress and PHP on Cloudways • AdManaged hosting with the flexibility to host WordPress, Magento, Laravel, or PHP apps, on multiple cloud providers. Cloudways by DigitalOcean.
- [ECCV 2024] Learning Video Context as Interleaved Multimodal Sequences☆43Mar 11, 2025Updated last year
- Speech2Action CVPR Poster Source Code☆20Apr 29, 2020Updated 5 years ago
- The implementation of CVPR2021 paper Temporal Query Networks for Fine-grained Video Understanding☆64Mar 9, 2022Updated 4 years ago
- ☆14Dec 25, 2024Updated last year
- Splits for epic-sounds dataset☆86Aug 2, 2025Updated 7 months ago
- VoxSRC2022 workshop development kit☆19Jul 21, 2022Updated 3 years ago
- Code for "SePPO: Semi-Policy Preference Optimization for Diffusion Alignment."☆18Oct 7, 2024Updated last year
- Cross-Self KV Cache Pruning for Efficient Vision-Language Inference☆10Dec 15, 2024Updated last year
- Evaluation script for VoxMovies dataset in PyTorch☆23Jan 12, 2024Updated 2 years ago
- Virtual machines for every use case on DigitalOcean • AdGet dependable uptime with 99.99% SLA, simple security tools, and predictable monthly pricing with DigitalOcean's virtual machines, called Droplets.
- [ICLR 2026] Empowering Small VLMs to Think with Dynamic Memorization and Exploration☆16Mar 18, 2026Updated last week
- ☆55Jul 16, 2025Updated 8 months ago
- ☆15Dec 2, 2025Updated 3 months ago
- [🏆 IJCV 2025 & ACCV 2024 Best Paper Honorable Mention] Official pytorch implementation of the paper "High-Quality Visually-Guided Sound …☆28Nov 1, 2025Updated 4 months ago
- ☆11Dec 6, 2024Updated last year
- [CVPRW'23] The official PyTorch implementation of NamedMask☆23Jun 12, 2023Updated 2 years ago
- [ECCV 2024] The first zero-shot setting for spatio-temporal video grounding.☆11Jul 16, 2024Updated last year