Code for our ACL 2025 paper "Language Repository for Long Video Understanding"
☆34 · Jun 17, 2024 · Updated last year
Alternatives and similar repositories for LangRepo
Users who are interested in LangRepo are comparing it to the repositories listed below.
- [ICLR'25] Multimodal Video Understanding Framework (MVU) ☆56 · Jan 31, 2025 · Updated last year
- Code for LifelongMemory: Leveraging LLMs for Answering Queries in Long-form Egocentric Videos ☆28 · Oct 27, 2025 · Updated 4 months ago
- [ECCV 2024] Official Implementation of CoPT: Unsupervised Domain Adaptive Segmentation using Domain-Agnostic Text Embeddings ☆11 · Feb 24, 2025 · Updated last year
- [WIP] Code for LangToMo ☆20 · Jun 25, 2025 · Updated 8 months ago
- Implementation of the model: "(MC-ViT)" from the paper: "Memory Consolidation Enables Long-Context Video Understanding" ☆27 · Jan 17, 2026 · Updated last month
- Perceptual Grouping in Contrastive Vision-Language Models (ICCV'23) ☆37 · Jan 1, 2024 · Updated 2 years ago
- Official implementation for "A Simple LLM Framework for Long-Range Video Question-Answering" ☆106 · Oct 27, 2024 · Updated last year
- ☆18 · Dec 17, 2022 · Updated 3 years ago
- [Main Conference @ EACL'26] [Workshop @ NeurIPS'24] LVNet. ☆42 · Feb 10, 2026 · Updated 3 weeks ago
- Code for NeurIPS 2022 paper "Learning Viewpoint-Agnostic Visual Representations by Recovering Tokens in 3D Space" ☆20 · Apr 20, 2023 · Updated 2 years ago
- This is a Python library. Install with "python3 -m pip install rp" then run with "python3 -m rp" or just "rp". Requires Python ≥ 3.5 ☆13 · Feb 16, 2026 · Updated 3 weeks ago
- This repository contains the implementation for our work "TopoDiffusionNet: A Topology-aware Diffusion Model", accepted to ICLR 2025. ☆21 · Apr 17, 2025 · Updated 10 months ago
- Environments for Active Vision Reinforcement Learning ☆28 · Oct 10, 2024 · Updated last year
- Code for CVPR25 paper "VideoTree: Adaptive Tree-based Video Representation for LLM Reasoning on Long Videos" ☆154 · Jun 23, 2025 · Updated 8 months ago
- [ICRA'24] Crossway Diffusion: Improving Diffusion-based Visuomotor Policy via Self-supervised Learning ☆70 · Aug 4, 2024 · Updated last year
- ☆17 · Oct 22, 2024 · Updated last year
- ICDE 2023 Paper, GAR: A Generate-and-Rank Approach for Natural Language to SQL Translation ☆19 · Sep 19, 2023 · Updated 2 years ago
- This is the official code of VideoAgent: A Memory-augmented Multimodal Agent for Video Understanding (ECCV 2024) ☆301 · Dec 5, 2024 · Updated last year
- Official implementation of CVPR 2024 paper "vid-TLDR: Training Free Token merging for Light-weight Video Transformer". ☆55 · Oct 21, 2025 · Updated 4 months ago
- WACV 2024: "PathLDM: Text conditioned Latent Diffusion Model for Histopathology" ☆48 · Jul 7, 2024 · Updated last year
- AI Multi-agent system for real-time, adaptive supply chain coordination and optimization leveraging responsive AI clusters. ☆35 · Mar 28, 2024 · Updated last year
- Official repository of DoraemonGPT: Toward Understanding Dynamic Scenes with Large Language Models ☆89 · Feb 6, 2026 · Updated last month
- [ICLR'25] LLaRA: Supercharging Robot Learning Data for Vision-Language Policy ☆227 · Mar 29, 2025 · Updated 11 months ago
- Visual Programming for Text-to-Image Generation and Evaluation (NeurIPS 2023) ☆57 · Jul 25, 2023 · Updated 2 years ago
- SUPERVAIZER is a toolkit built for the age of AI interoperability. At its core, it implements Google's Agent-to-Agent (A2A) protocol, ena… ☆14 · Feb 4, 2026 · Updated last month
- [EMNLP-2022 Findings] Code for paper "ProGen: Progressive Zero-shot Dataset Generation via In-context Feedback". ☆27 · Feb 4, 2023 · Updated 3 years ago
- NExT-QA: Next Phase of Question-Answering to Explaining Temporal Actions (CVPR'21) ☆185 · Aug 2, 2025 · Updated 7 months ago
- Demos of some issues with LangChain. ☆31 · Jul 14, 2023 · Updated 2 years ago
- [ICLR 2025] Video-STaR: Self-Training Enables Video Instruction Tuning with Any Supervision ☆72 · Jul 10, 2024 · Updated last year
- [NeurIPS 2023] Self-Chained Image-Language Model for Video Localization and Question Answering ☆196 · Jan 14, 2024 · Updated 2 years ago
- Sotopia-π: Interactive Learning of Socially Intelligent Language Agents (ACL 2024) ☆81 · May 7, 2024 · Updated last year
- ☆17 · Sep 1, 2024 · Updated last year
- ☆24 · Feb 4, 2026 · Updated last month
- Self-hosted GPT-4V api ☆27 · Nov 6, 2023 · Updated 2 years ago
- Implementation of the "the first large-scale multimodal mixture of experts models." from the paper: "Multimodal Contrastive Learning with… ☆36 · Jan 31, 2026 · Updated last month
- Long Context Transfer from Language to Vision ☆402 · Mar 18, 2025 · Updated 11 months ago
- [CVPR 2024] MovieChat: From Dense Token to Sparse Memory for Long Video Understanding ☆688 · Jan 29, 2025 · Updated last year
- ☆138 · Sep 29, 2024 · Updated last year
- [CVPR'2024 Highlight] Official PyTorch implementation of the paper "VTimeLLM: Empower LLM to Grasp Video Moments". ☆294 · Jun 13, 2024 · Updated last year