☆60Aug 10, 2023Updated 2 years ago
Alternatives and similar repositories for video-localized-narratives
Users that are interested in video-localized-narratives are comparing it to the libraries listed below
Sorting:
- ☆13Jul 20, 2024Updated last year
- [CVPR23 Highlight] CREPE: Can Vision-Language Foundation Models Reason Compositionally?☆35Apr 27, 2023Updated 2 years ago
- ☆53Oct 16, 2023Updated 2 years ago
- ☆58Apr 24, 2024Updated last year
- Code for "Are “Hierarchical” Visual Representations Hierarchical?" in NeurIPS Workshop for Symmetry and Geometry in Neural Representation…☆22Nov 8, 2023Updated 2 years ago
- SSL Video Representation Learning project☆14Jul 8, 2025Updated 8 months ago
- Official code for DAM: Dynamic Adapter Merging for Continual Video QA Learning☆14Apr 25, 2024Updated last year
- [NeurIPS 2023] Official repository for "Distilling Out-of-Distribution Robustness from Vision-Language Foundation Models"☆11Jun 18, 2024Updated last year
- Code and datasets for "Text encoders are performance bottlenecks in contrastive vision-language models". Coming soon!☆11May 24, 2023Updated 2 years ago
- Implementation of Pix2Seq in PyTorch☆10Feb 3, 2022Updated 4 years ago
- This repo contains documentation and code needed to use PACO dataset: data loaders and training and evaluation scripts for objects, parts…☆292Feb 12, 2024Updated 2 years ago
- [ICCV 2023 Workshop] The Official Implementation of The First Prize Solution for RVOS Competition☆14Jan 1, 2024Updated 2 years ago
- SMILE: A Multimodal Dataset for Understanding Laughter☆13Jun 15, 2023Updated 2 years ago
- Implementation of paper 'Helping Hands: An Object-Aware Ego-Centric Video Recognition Model'☆33Nov 7, 2023Updated 2 years ago
- This is the oficial repository for "Safer-Instruct: Aligning Language Models with Automated Preference Data"☆17Feb 22, 2024Updated 2 years ago
- ☆19Jan 30, 2023Updated 3 years ago
- Official This-Is-My Dataset published in CVPR 2023☆16Jul 18, 2024Updated last year
- Syphus: Automatic Instruction-Response Generation Pipeline☆14Dec 14, 2023Updated 2 years ago
- ☆41Sep 25, 2023Updated 2 years ago
- ☆43Jul 9, 2025Updated 7 months ago
- Code and datasets for "What’s “up” with vision-language models? Investigating their struggle with spatial reasoning".☆71Feb 28, 2024Updated 2 years ago
- Official repo for "DynaMITe: Dynamic Query Bootstrapping for Multi-object Interactive Segmentation Transformer"☆19Sep 29, 2023Updated 2 years ago
- Spatial-Temporal Knowledge-Embedded Transformer for Video Scene Graph Generation (TIP 2024, ACM MM 2023)☆19Mar 13, 2024Updated last year
- ☆16May 26, 2023Updated 2 years ago
- ViLMA: A Zero-Shot Benchmark for Linguistic and Temporal Grounding in Video-Language Models (ICLR 2024, Official Implementation)☆16Jan 18, 2024Updated 2 years ago
- 【CVPR'24】OST: Refining Text Knowledge with Optimal Spatio-Temporal Descriptor for General Video Recognition☆38Apr 27, 2024Updated last year
- ☆58Dec 2, 2025Updated 3 months ago
- VPEval Codebase from Visual Programming for Text-to-Image Generation and Evaluation (NeurIPS 2023)☆45Nov 29, 2023Updated 2 years ago
- ☆242Jun 4, 2025Updated 9 months ago
- NeurIPS 2023 - Cappy: Outperforming and Boosting Large Multi-Task LMs with a Small Scorer☆46Mar 29, 2024Updated last year
- ☆42Jan 22, 2024Updated 2 years ago
- ☆46Jan 11, 2024Updated 2 years ago
- ☆17Jun 15, 2022Updated 3 years ago
- Official code for our CVPR 2023 paper: Test of Time: Instilling Video-Language Models with a Sense of Time☆46Jun 11, 2024Updated last year
- ViCaS: A Dataset for Combining Holistic and Pixel-level Video Understanding using Captions with Grounded Segmentation (CVPR'25)☆20Apr 2, 2025Updated 11 months ago
- Implementation of the model: "(MC-ViT)" from the paper: "Memory Consolidation Enables Long-Context Video Understanding"☆27Jan 17, 2026Updated last month
- The official repo for the technical report "Scalable Mask Annotation for Video Text Spotting"☆16May 3, 2023Updated 2 years ago
- SWIM-IR is a Synthetic Wikipedia-based Multilingual Information Retrieval training set with 28 million query-passage pairs spanning 33 la…☆49Nov 13, 2023Updated 2 years ago
- ☆49Nov 12, 2022Updated 3 years ago