Code for our ACL 2025 paper "Language Repository for Long Video Understanding"
★36 · Jun 17, 2024 · Updated last year
Alternatives and similar repositories for LangRepo
Users that are interested in LangRepo are comparing it to the libraries listed below.
- [ICLR'25] Multimodal Video Understanding Framework (MVU) ★56 · Jan 31, 2025 · Updated last year
- Code for LifelongMemory: Leveraging LLMs for Answering Queries in Long-form Egocentric Videos ★30 · Oct 27, 2025 · Updated 7 months ago
- [ECCV 2024] Official Implementation of CoPT: Unsupervised Domain Adaptive Segmentation using Domain-Agnostic Text Embeddings ★11 · Feb 24, 2025 · Updated last year
- [WIP] Code for LangToMo ★21 · Mar 19, 2026 · Updated 2 months ago
- An unofficial PyTorch dataloader for Open X-Embodiment Datasets https://github.com/google-deepmind/open_x_embodiment ★25 · Jan 9, 2025 · Updated last year
- Implementation of the model: "(MC-ViT)" from the paper: "Memory Consolidation Enables Long-Context Video Understanding" ★27 · May 11, 2026 · Updated 2 weeks ago
- Official implementation for "A Simple LLM Framework for Long-Range Video Question-Answering" ★105 · Oct 27, 2024 · Updated last year
- ★14 · Jun 25, 2022 · Updated 3 years ago
- This is a Python library. Install with "python3 -m pip install rp" then run with "python3 -m rp" or just "rp". Requires Python ≥ 3.5 ★13 · May 21, 2026 · Updated last week
- Code for the paper Seeing the Pose in the Pixels: Learning Pose-Aware Representations in Vision Transformers ★21 · Aug 2, 2024 · Updated last year
- Code for NeurIPS 2022 paper "Learning Viewpoint-Agnostic Visual Representations by Recovering Tokens in 3D Space" ★20 · Apr 20, 2023 · Updated 3 years ago
- Agentic Keyframe Search for Video Question Answering ★18 · Apr 7, 2025 · Updated last year
- [Main Conference @ EACL'26] [Workshop @ NeurIPS'24] LVNet. ★43 · Feb 10, 2026 · Updated 3 months ago
- [ICRA'24] Crossway Diffusion: Improving Diffusion-based Visuomotor Policy via Self-supervised Learning ★72 · Aug 4, 2024 · Updated last year
- Code for CVPR25 paper "VideoTree: Adaptive Tree-based Video Representation for LLM Reasoning on Long Videos" ★162 · Jun 23, 2025 · Updated 11 months ago
- Code for NeurIPS 2023 paper "Active Vision Reinforcement Learning with Limited Visual Observability" ★54 · Oct 10, 2024 · Updated last year
- This repository contains the implementation for our work "TopoDiffusionNet: A Topology-aware Diffusion Model", accepted to ICLR 2025. ★25 · Apr 17, 2025 · Updated last year
- Peekaboo: Text to Image Diffusion Models are Zero-Shot Segmentors ★31 · Jun 2, 2024 · Updated last year
- Official code of *Towards Event-oriented Long Video Understanding* ★12 · Jul 26, 2024 · Updated last year
- Official repository for "Self-Supervised Video Transformer" (CVPR'22) ★108 · Jun 26, 2024 · Updated last year
- Code for our ICCV 2025 paper "Adaptive Caching for Faster Video Generation with Diffusion Transformers" ★171 · Nov 5, 2024 · Updated last year
- This is the official code of VideoAgent: A Memory-augmented Multimodal Agent for Video Understanding (ECCV 2024) ★311 · Dec 5, 2024 · Updated last year
- [ICLR'25] LLaRA: Supercharging Robot Learning Data for Vision-Language Policy ★229 · Mar 29, 2025 · Updated last year
- WACV 2024: "PathLDM: Text conditioned Latent Diffusion Model for Histopathology" ★50 · Jul 7, 2024 · Updated last year
- Official implementation of CVPR 2024 paper "vid-TLDR: Training Free Token merging for Light-weight Video Transformer". ★55 · Oct 21, 2025 · Updated 7 months ago
- ★114 · Dec 30, 2024 · Updated last year
- Visual Programming for Text-to-Image Generation and Evaluation (NeurIPS 2023) ★57 · Jul 25, 2023 · Updated 2 years ago
- An experiment with movie scenes and contrastive learning ★11 · Feb 1, 2025 · Updated last year
- [ICCV'25] HERMES: temporal-coHERent long-forM understanding with Episodes and Semantics ★37 · Sep 10, 2025 · Updated 8 months ago
- ★208 · Jul 12, 2024 · Updated last year
- [NeurIPS 2023] Self-Chained Image-Language Model for Video Localization and Question Answering ★198 · Jan 14, 2024 · Updated 2 years ago
- Official code for DAM: Dynamic Adapter Merging for Continual Video QA Learning ★15 · Apr 25, 2024 · Updated 2 years ago
- Official Implementation for "SiLVR: A Simple Language-based Video Reasoning Framework" ★19 · Jan 18, 2026 · Updated 4 months ago
- ★138 · Sep 29, 2024 · Updated last year
- ★11 · Sep 1, 2020 · Updated 5 years ago
- ★108 · Jul 30, 2024 · Updated last year
- [EMNLP 2024] TraveLER: A Modular Multi-LMM Agent Framework for Video Question-Answering ★18 · Oct 31, 2024 · Updated last year
- [ICLR 2025] CREMA: Generalizable and Efficient Video-Language Reasoning via Multimodal Modular Fusion ★56 · Jul 1, 2025 · Updated 10 months ago
- ★20 · Mar 10, 2025 · Updated last year