Code for our ACL 2025 paper "Language Repository for Long Video Understanding"
★36 · Jun 17, 2024 · Updated last year
Alternatives and similar repositories for LangRepo
Users that are interested in LangRepo are comparing it to the libraries listed below.
- [ICLR'25] Multimodal Video Understanding Framework (MVU) ★56 · Jan 31, 2025 · Updated last year
- Code for LifelongMemory: Leveraging LLMs for Answering Queries in Long-form Egocentric Videos ★28 · Oct 27, 2025 · Updated 6 months ago
- [ECCV 2024] Official Implementation of CoPT: Unsupervised Domain Adaptive Segmentation using Domain-Agnostic Text Embeddings ★11 · Feb 24, 2025 · Updated last year
- An unofficial pytorch dataloader for Open X-Embodiment Datasets https://github.com/google-deepmind/open_x_embodiment ★25 · Jan 9, 2025 · Updated last year
- Implementation of the model: "(MC-ViT)" from the paper: "Memory Consolidation Enables Long-Context Video Understanding" ★27 · Apr 13, 2026 · Updated 3 weeks ago
- ★14 · Jun 25, 2022 · Updated 3 years ago
- This is a python library. Install with "python3 -m pip install rp" then run with "python3 -m rp" or just "rp". Requires python≥3.5 ★13 · Apr 29, 2026 · Updated last week
- Code for the paper Seeing the Pose in the Pixels: Learning Pose-Aware Representations in Vision Transformers ★21 · Aug 2, 2024 · Updated last year
- Code for NeurIPS 2022 paper "Learning Viewpoint-Agnostic Visual Representations by Recovering Tokens in 3D Space" ★20 · Apr 20, 2023 · Updated 3 years ago
- Agentic Keyframe Search for Video Question Answering ★18 · Apr 7, 2025 · Updated last year
- [Main Conference @ EACL'26] [Workshop @ NeurIPS'24] LVNet. ★43 · Feb 10, 2026 · Updated 2 months ago
- [ICRA'24] Crossway Diffusion: Improving Diffusion-based Visuomotor Policy via Self-supervised Learning ★72 · Aug 4, 2024 · Updated last year
- Code for CVPR25 paper "VideoTree: Adaptive Tree-based Video Representation for LLM Reasoning on Long Videos" ★159 · Jun 23, 2025 · Updated 10 months ago
- Code for NeurIPS 2023 paper "Active Vision Reinforcement Learning with Limited Visual Observability" ★54 · Oct 10, 2024 · Updated last year
- ★147 · Apr 16, 2025 · Updated last year
- Peekaboo: Text to Image Diffusion Models are Zero-Shot Segmentors ★31 · Jun 2, 2024 · Updated last year
- Official repository of DoraemonGPT: Toward Understanding Dynamic Scenes with Large Language Models ★90 · Feb 6, 2026 · Updated 3 months ago
- Official code of *Towards Event-oriented Long Video Understanding* ★12 · Jul 26, 2024 · Updated last year
- Official repository for "Self-Supervised Video Transformer" (CVPR'22) ★108 · Jun 26, 2024 · Updated last year
- Code for our ICCV 2025 paper "Adaptive Caching for Faster Video Generation with Diffusion Transformers" ★171 · Nov 5, 2024 · Updated last year
- This is the official code of VideoAgent: A Memory-augmented Multimodal Agent for Video Understanding (ECCV 2024) ★307 · Dec 5, 2024 · Updated last year
- [ICLR'25] LLaRA: Supercharging Robot Learning Data for Vision-Language Policy ★229 · Mar 29, 2025 · Updated last year
- WACV 2024: "PathLDM: Text conditioned Latent Diffusion Model for Histopathology" ★49 · Jul 7, 2024 · Updated last year
- Official implementation of CVPR 2024 paper "vid-TLDR: Training Free Token merging for Light-weight Video Transformer". ★55 · Oct 21, 2025 · Updated 6 months ago
- ★112 · Dec 30, 2024 · Updated last year
- NExT-QA: Next Phase of Question-Answering to Explaining Temporal Actions (CVPR'21) ★190 · Aug 2, 2025 · Updated 9 months ago
- [ICCV'25] HERMES: temporal-coHERent long-forM understanding with Episodes and Semantics ★37 · Sep 10, 2025 · Updated 7 months ago
- ★208 · Jul 12, 2024 · Updated last year
- [NeurIPS 2023] Self-Chained Image-Language Model for Video Localization and Question Answering ★198 · Jan 14, 2024 · Updated 2 years ago
- Official code for DAM: Dynamic Adapter Merging for Continual Video QA Learning ★15 · Apr 25, 2024 · Updated 2 years ago
- Official Implementation for "SiLVR : A Simple Language-based Video Reasoning Framework" ★19 · Jan 18, 2026 · Updated 3 months ago
- ★107 · Jul 30, 2024 · Updated last year
- Theia: Distilling Diverse Vision Foundation Models for Robot Learning ★272 · Nov 6, 2025 · Updated 6 months ago
- [EMNLP 2024] TraveLER: A Modular Multi-LMM Agent Framework for Video Question-Answering ★18 · Oct 31, 2024 · Updated last year
- [ICLR 2025] CREMA: Generalizable and Efficient Video-Language Reasoning via Multimodal Modular Fusion ★56 · Jul 1, 2025 · Updated 10 months ago
- ★20 · Mar 10, 2025 · Updated last year
- ★46 · Apr 4, 2026 · Updated last month
- [CVPR'2024 Highlight] Official PyTorch implementation of the paper "VTimeLLM: Empower LLM to Grasp Video Moments". ★296 · Jun 13, 2024 · Updated last year
- mcp wrapper for openai built-in tools ★12 · Mar 13, 2025 · Updated last year