Code for our ACL 2025 paper "Language Repository for Long Video Understanding"
★36 · Jun 17, 2024 · Updated last year
Alternatives and similar repositories for LangRepo
Users that are interested in LangRepo are comparing it to the libraries listed below.
- [ICLR'25] Multimodal Video Understanding Framework (MVU) ★56 · Jan 31, 2025 · Updated last year
- Code for LifelongMemory: Leveraging LLMs for Answering Queries in Long-form Egocentric Videos ★28 · Oct 27, 2025 · Updated 5 months ago
- [ECCV 2024] Official Implementation of CoPT: Unsupervised Domain Adaptive Segmentation using Domain-Agnostic Text Embeddings ★11 · Feb 24, 2025 · Updated last year
- [WIP] Code for LangToMo ★20 · Mar 19, 2026 · Updated 3 weeks ago
- Perceptual Grouping in Contrastive Vision-Language Models (ICCV'23) ★37 · Jan 1, 2024 · Updated 2 years ago
- Implementation of the model: "(MC-ViT)" from the paper: "Memory Consolidation Enables Long-Context Video Understanding" ★27 · Mar 13, 2026 · Updated last month
- Official implementation for "A Simple LLM Framework for Long-Range Video Question-Answering" ★105 · Oct 27, 2024 · Updated last year
- ★14 · Jun 25, 2022 · Updated 3 years ago
- ★18 · Dec 17, 2022 · Updated 3 years ago
- This is a python library. Install with "python3 -m pip install rp" then run with "python3 -m rp" or just "rp". Requires python≥3.5 ★13 · Mar 17, 2026 · Updated last month
- Code for the paper Seeing the Pose in the Pixels: Learning Pose-Aware Representations in Vision Transformers ★21 · Aug 2, 2024 · Updated last year
- Code for NeurIPS 2022 paper "Learning Viewpoint-Agnostic Visual Representations by Recovering Tokens in 3D Space" ★20 · Apr 20, 2023 · Updated 2 years ago
- [Main Conference @ EACL'26] [Workshop @ NeurIPS'24] LVNet. ★43 · Feb 10, 2026 · Updated 2 months ago
- [ICRA'24] Crossway Diffusion: Improving Diffusion-based Visuomotor Policy via Self-supervised Learning ★71 · Aug 4, 2024 · Updated last year
- Code for NeurIPS 2023 paper "Active Vision Reinforcement Learning with Limited Visual Observability" ★54 · Oct 10, 2024 · Updated last year
- This repository contains the implementation for our work "TopoDiffusionNet: A Topology-aware Diffusion Model", accepted to ICLR 2025. ★22 · Apr 17, 2025 · Updated last year
- Video + CLIP Baseline for Ego4D Long Term Action Anticipation Challenge (CVPR 2022) ★15 · Jul 4, 2022 · Updated 3 years ago
- ★145 · Apr 16, 2025 · Updated last year
- Peekaboo: Text to Image Diffusion Models are Zero-Shot Segmentors ★31 · Jun 2, 2024 · Updated last year
- Official repository for "Self-Supervised Video Transformer" (CVPR'22) ★108 · Jun 26, 2024 · Updated last year
- Code for our ICCV 2025 paper "Adaptive Caching for Faster Video Generation with Diffusion Transformers" ★171 · Nov 5, 2024 · Updated last year
- This is the official code of VideoAgent: A Memory-augmented Multimodal Agent for Video Understanding (ECCV 2024) ★305 · Dec 5, 2024 · Updated last year
- [ECCV2022] [T-PAMI] StARformer: Transformer with State-Action-Reward Representations. ★96 · May 21, 2023 · Updated 2 years ago
- Official implementation of CVPR 2024 paper "vid-TLDR: Training Free Token merging for Light-weight Video Transformer". ★55 · Oct 21, 2025 · Updated 5 months ago
- ★111 · Dec 30, 2024 · Updated last year
- Visual Programming for Text-to-Image Generation and Evaluation (NeurIPS 2023) ★57 · Jul 25, 2023 · Updated 2 years ago
- An experiment with movie scenes and contrastive learning ★11 · Feb 1, 2025 · Updated last year
- NExT-QA: Next Phase of Question-Answering to Explaining Temporal Actions (CVPR'21) ★189 · Aug 2, 2025 · Updated 8 months ago
- [ICCV'25] HERMES: temporal-coHERent long-forM understanding with Episodes and Semantics ★38 · Sep 10, 2025 · Updated 7 months ago
- [NeurIPS 2023] Self-Chained Image-Language Model for Video Localization and Question Answering ★198 · Jan 14, 2024 · Updated 2 years ago
- Official code for DAM: Dynamic Adapter Merging for Continual Video QA Learning ★14 · Apr 25, 2024 · Updated last year
- Official Repository of "Fibottention: Inceptive Visual Representation Learning with Diverse Attention Across Heads" ★17 · Oct 6, 2025 · Updated 6 months ago
- ★137 · Sep 29, 2024 · Updated last year
- CLiC: Concept Learning in Context ★10 · Jan 24, 2025 · Updated last year
- ★107 · Jul 30, 2024 · Updated last year
- Theia: Distilling Diverse Vision Foundation Models for Robot Learning ★273 · Nov 6, 2025 · Updated 5 months ago
- [EMNLP 2024] TraveLER: A Modular Multi-LMM Agent Framework for Video Question-Answering ★17 · Oct 31, 2024 · Updated last year
- [ICLR 2025] CREMA: Generalizable and Efficient Video-Language Reasoning via Multimodal Modular Fusion ★56 · Jul 1, 2025 · Updated 9 months ago
- ★20 · Mar 10, 2025 · Updated last year