Code for our ACL 2025 paper "Language Repository for Long Video Understanding"
★36 · Jun 17, 2024 · Updated last year
Alternatives and similar repositories for LangRepo
Users who are interested in LangRepo are comparing it to the libraries listed below.
- [ICLR'25] Multimodal Video Understanding Framework (MVU) ★57 · Jan 31, 2025 · Updated last year
- Code for LifelongMemory: Leveraging LLMs for Answering Queries in Long-form Egocentric Videos ★28 · Oct 27, 2025 · Updated 5 months ago
- [ECCV 2024] Official Implementation of CoPT: Unsupervised Domain Adaptive Segmentation using Domain-Agnostic Text Embeddings ★11 · Feb 24, 2025 · Updated last year
- [WIP] Code for LangToMo ★20 · Mar 19, 2026 · Updated last week
- Perceptual Grouping in Contrastive Vision-Language Models (ICCV'23) ★37 · Jan 1, 2024 · Updated 2 years ago
- An unofficial pytorch dataloader for Open X-Embodiment Datasets https://github.com/google-deepmind/open_x_embodiment ★25 · Jan 9, 2025 · Updated last year
- Implementation of the model: "(MC-ViT)" from the paper: "Memory Consolidation Enables Long-Context Video Understanding" ★27 · Mar 13, 2026 · Updated 2 weeks ago
- Official implementation for "A Simple LLM Framework for Long-Range Video Question-Answering" ★106 · Oct 27, 2024 · Updated last year
- ★14 · Jun 25, 2022 · Updated 3 years ago
- ★18 · Dec 17, 2022 · Updated 3 years ago
- This is a python library. Install with "python3 -m pip install rp" then run with "python3 -m rp" or just "rp". Requires python≥3.5 ★13 · Mar 17, 2026 · Updated last week
- Code for the paper Seeing the Pose in the Pixels: Learning Pose-Aware Representations in Vision Transformers ★21 · Aug 2, 2024 · Updated last year
- Code for NeurIPS 2022 paper "Learning Viewpoint-Agnostic Visual Representations by Recovering Tokens in 3D Space" ★20 · Apr 20, 2023 · Updated 2 years ago
- [Main Conference @ EACL'26] [Workshop @ NeurIPS'24] LVNet. ★42 · Feb 10, 2026 · Updated last month
- [ICRA'24] Crossway Diffusion: Improving Diffusion-based Visuomotor Policy via Self-supervised Learning ★70 · Aug 4, 2024 · Updated last year
- Code for CVPR25 paper "VideoTree: Adaptive Tree-based Video Representation for LLM Reasoning on Long Videos" ★158 · Jun 23, 2025 · Updated 9 months ago
- This repository contains the implementation for our work "TopoDiffusionNet: A Topology-aware Diffusion Model", accepted to ICLR 2025. ★22 · Apr 17, 2025 · Updated 11 months ago
- ★142 · Apr 16, 2025 · Updated 11 months ago
- Video + CLIP Baseline for Ego4D Long Term Action Anticipation Challenge (CVPR 2022) ★15 · Jul 4, 2022 · Updated 3 years ago
- Peekaboo: Text to Image Diffusion Models are Zero-Shot Segmentors ★31 · Jun 2, 2024 · Updated last year
- Official repository of DoraemonGPT: Toward Understanding Dynamic Scenes with Large Language Models ★90 · Feb 6, 2026 · Updated last month
- Official code of *Towards Event-oriented Long Video Understanding* ★12 · Jul 26, 2024 · Updated last year
- Official repository for "Self-Supervised Video Transformer" (CVPR'22) ★108 · Jun 26, 2024 · Updated last year
- Code for our ICCV 2025 paper "Adaptive Caching for Faster Video Generation with Diffusion Transformers" ★170 · Nov 5, 2024 · Updated last year
- This is the official code of VideoAgent: A Memory-augmented Multimodal Agent for Video Understanding (ECCV 2024) ★304 · Dec 5, 2024 · Updated last year
- Official implementation of CVPR 2024 paper "vid-TLDR: Training Free Token merging for Light-weight Video Transformer". ★55 · Oct 21, 2025 · Updated 5 months ago
- ★109 · Dec 30, 2024 · Updated last year
- An experiment with movie scenes and contrastive learning ★11 · Feb 1, 2025 · Updated last year
- NExT-QA: Next Phase of Question-Answering to Explaining Temporal Actions (CVPR'21) ★188 · Aug 2, 2025 · Updated 7 months ago
- [ICCV'25] HERMES: temporal-coHERent long-forM understanding with Episodes and Semantics ★38 · Sep 10, 2025 · Updated 6 months ago
- ★206 · Jul 12, 2024 · Updated last year
- [NeurIPS 2023] Self-Chained Image-Language Model for Video Localization and Question Answering ★198 · Jan 14, 2024 · Updated 2 years ago
- Official code for DAM: Dynamic Adapter Merging for Continual Video QA Learning ★14 · Apr 25, 2024 · Updated last year
- ★138 · Sep 29, 2024 · Updated last year
- Official Repository of "Fibottention: Inceptive Visual Representation Learning with Diverse Attention Across Heads" ★17 · Oct 6, 2025 · Updated 5 months ago
- Official Implementation for "SiLVR : A Simple Language-based Video Reasoning Framework" ★19 · Jan 18, 2026 · Updated 2 months ago
- ★107 · Jul 30, 2024 · Updated last year
- ★11 · Sep 1, 2020 · Updated 5 years ago
- Theia: Distilling Diverse Vision Foundation Models for Robot Learning ★271 · Nov 6, 2025 · Updated 4 months ago