This repo holds the implementation of PAVE: Patching and Adapting Video Large Language Models (CVPR2025)
☆27 · Sep 6, 2025 · Updated 6 months ago
Alternatives and similar repositories for PAVE
Users that are interested in PAVE are comparing it to the libraries listed below.
- The official implementation of CVPR 2021 Paper: Improving Weakly Supervised Visual Grounding by Contrastive Knowledge Distillation. ☆12 · Oct 15, 2021 · Updated 4 years ago
- ☆17 · Dec 23, 2022 · Updated 3 years ago
- [ICCV 2025] Official code for "AIM: Adaptive Inference of Multi-Modal LLMs via Token Merging and Pruning" ☆55 · Oct 9, 2025 · Updated 5 months ago
- Dataset of measurements from a low-cost single-photon camera used in our CVPR 2024 paper "Towards 3D Vision with Low-Cost Single-Photon C… ☆13 · Nov 24, 2025 · Updated 4 months ago
- Learning Continuous Grasping Function with a Dexterous Hand from Human Demonstrations, RA-L 2023 ☆22 · Aug 28, 2024 · Updated last year
- Code release for DeepEDM (ICML 2025) ☆28 · Jan 20, 2026 · Updated 2 months ago
- Official code for "Rethinking Chain-of-Thought Reasoning for Videos" ☆20 · Dec 14, 2025 · Updated 3 months ago
- code for paper "Physical-World Optical Adversarial Attacks on 3D Face Recognition" ☆20 · Oct 19, 2023 · Updated 2 years ago
- [CVPR 2023] Official code for "Learning Procedure-aware Video Representation from Instructional Videos and Their Narrations" ☆56 · Aug 8, 2023 · Updated 2 years ago
- ☆12 · Nov 16, 2020 · Updated 5 years ago
- fork from https://github.com/jwyang/faster-rcnn.pytorch ☆10 · Aug 6, 2018 · Updated 7 years ago
- [ECCV'24 Oral] PiTe: Pixel-Temporal Alignment for Large Video-Language Model ☆17 · Feb 13, 2025 · Updated last year
- Official Implementation of Video-MA2MBA ☆12 · Dec 3, 2024 · Updated last year
- The implementation of "A Simple Baseline for Weakly-Supervised Scene Graph Generation" for ICCV2021 ☆15 · Aug 17, 2021 · Updated 4 years ago
- [ICLR 2025] CREMA: Generalizable and Efficient Video-Language Reasoning via Multimodal Modular Fusion ☆56 · Jul 1, 2025 · Updated 8 months ago
- [ICCV 2025] Official Repository of VideoLLaMB: Long Video Understanding with Recurrent Memory Bridges ☆84 · Feb 27, 2025 · Updated last year
- TEMPURA enables video-language models to reason about causal event relationships and generate fine-grained, timestamped descriptions of u… ☆25 · Jun 4, 2025 · Updated 9 months ago
- Official code repository of Shuffle-R1 ☆25 · Feb 23, 2026 · Updated last month
- A Fast PyTorch implementation for ICCV 19 paper "BMN: Boundary-Matching Network for Temporal Action Proposal Generation" ☆10 · Jul 29, 2019 · Updated 6 years ago
- [ICML 2024 Oral] LSH-Based Efficient Point Transformer (HEPT) ☆24 · Jan 24, 2025 · Updated last year
- Official Implementation of SnAG (CVPR 2024) ☆57 · Apr 26, 2025 · Updated 11 months ago
- [ICCV 2021] Official code for "Learning to Generate Scene Graph from Natural Language Supervision" ☆100 · Apr 4, 2023 · Updated 2 years ago
- [CVPR2024] Multiagent Multitraversal Multimodal Self-Driving: Open MARS Dataset ☆60 · Jun 25, 2024 · Updated last year
- [NeurIPS25] Official Implementation (Pytorch) of "DeepVideo-R1" ☆31 · Feb 22, 2026 · Updated last month
- Streaming Video Instruction Tuning ☆62 · Feb 25, 2026 · Updated last month
- ☆28 · Apr 8, 2025 · Updated 11 months ago
- [ECCV 2020] Official code for "Comprehensive Image Captioning via Scene Graph Decomposition" ☆99 · Aug 20, 2024 · Updated last year
- Demonstration of using Caffe2 inside an Android application. ☆10 · Dec 23, 2018 · Updated 7 years ago
- PyTorch speech2text inference script for the NVidia openseq2seq wav2letter model variant ☆10 · Aug 12, 2019 · Updated 6 years ago
- ☆30 · Jan 18, 2026 · Updated 2 months ago
- UnifiedMLLM: Enabling Unified Representation for Multi-modal Multi-tasks With Large Language Model ☆22 · Aug 5, 2024 · Updated last year
- [ICCV 2025] Dynamic-VLM ☆28 · Dec 16, 2024 · Updated last year
- Code for our paper "Eventful Transformers: Leveraging Temporal Redundancy in Vision Transformers" ☆37 · Jan 27, 2026 · Updated last month
- Implementation of FixMatch in PyTorch and experimentations ☆12 · Aug 9, 2020 · Updated 5 years ago
- A program that uses MediaPipe's FaceMesh detection to display the Sharingan (©NARUTO -ナルト-) over the iris region ☆11 · Apr 16, 2022 · Updated 3 years ago
- Code of Deno-IF: Unsupervised Noisy Visible and Infrared Image Fusion Method (NeurIPS 2025) ☆24 · Dec 27, 2025 · Updated 2 months ago
- ☆26 · Aug 22, 2025 · Updated 7 months ago
- ☆10 · Apr 7, 2025 · Updated 11 months ago
- ☆16 · Mar 10, 2020 · Updated 6 years ago