kkahatapitiya / LangRepo
Code for our ACL 2025 paper "Language Repository for Long Video Understanding"
☆34 · Jun 17, 2024 · Updated last year
Alternatives and similar repositories for LangRepo
Users that are interested in LangRepo are comparing it to the libraries listed below
- [ICLR'25] Multimodal Video Understanding Framework (MVU) ☆55 · Jan 31, 2025 · Updated last year
- Code for LifelongMemory: Leveraging LLMs for Answering Queries in Long-form Egocentric Videos ☆28 · Oct 27, 2025 · Updated 3 months ago
- Implementation of the model: "(MC-ViT)" from the paper: "Memory Consolidation Enables Long-Context Video Understanding" ☆27 · Jan 17, 2026 · Updated 3 weeks ago
- An unofficial pytorch dataloader for Open X-Embodiment Datasets https://github.com/google-deepmind/open_x_embodiment ☆23 · Jan 9, 2025 · Updated last year
- Perceptual Grouping in Contrastive Vision-Language Models (ICCV'23) ☆37 · Jan 1, 2024 · Updated 2 years ago
- Code for the paper Seeing the Pose in the Pixels: Learning Pose-Aware Representations in Vision Transformers ☆21 · Aug 2, 2024 · Updated last year
- [Main Conference @ EACL'26] [Workshop @ NeurIPS'24] LVNet. ☆42 · Updated this week
- Code for CVPR25 paper "VideoTree: Adaptive Tree-based Video Representation for LLM Reasoning on Long Videos" ☆154 · Jun 23, 2025 · Updated 7 months ago
- Agentic Keyframe Search for Video Question Answering ☆16 · Apr 7, 2025 · Updated 10 months ago
- ☆14 · Jun 25, 2022 · Updated 3 years ago
- Official code of *Towards Event-oriented Long Video Understanding* ☆12 · Jul 26, 2024 · Updated last year
- [ICRA'24] Crossway Diffusion: Improving Diffusion-based Visuomotor Policy via Self-supervised Learning ☆70 · Aug 4, 2024 · Updated last year
- Code for NeurIPS 2023 paper "Active Vision Reinforcement Learning with Limited Visual Observability" ☆54 · Oct 10, 2024 · Updated last year
- Video + CLIP Baseline for Ego4D Long Term Action Anticipation Challenge (CVPR 2022) ☆15 · Jul 4, 2022 · Updated 3 years ago
- ☆134 · Apr 16, 2025 · Updated 10 months ago
- ICDE 2023 Paper, GAR: A Generate-and-Rank Approach for Natural Language to SQL Translation ☆19 · Sep 19, 2023 · Updated 2 years ago
- Official repository for "Self-Supervised Video Transformer" (CVPR'22) ☆108 · Jun 26, 2024 · Updated last year
- This is the official code of VideoAgent: A Memory-augmented Multimodal Agent for Video Understanding (ECCV 2024) ☆297 · Dec 5, 2024 · Updated last year
- Official implementation of CVPR 2024 paper "vid-TLDR: Training Free Token merging for Light-weight Video Transformer". ☆54 · Oct 21, 2025 · Updated 3 months ago
- WACV 2024: "PathLDM: Text conditioned Latent Diffusion Model for Histopathology" ☆48 · Jul 7, 2024 · Updated last year
- AI Multi-agent system for real-time, adaptive supply chain coordination and optimization leveraging responsive AI clusters. ☆35 · Mar 28, 2024 · Updated last year
- ☆203 · Jul 12, 2024 · Updated last year
- This is the official repository of LLAVIDAL ☆23 · Oct 4, 2025 · Updated 4 months ago
- [ICLR'25] LLaRA: Supercharging Robot Learning Data for Vision-Language Policy ☆227 · Mar 29, 2025 · Updated 10 months ago
- Visual Programming for Text-to-Image Generation and Evaluation (NeurIPS 2023) ☆57 · Jul 25, 2023 · Updated 2 years ago
- ☆109 · Dec 30, 2024 · Updated last year
- (2024CVPR) MA-LMM: Memory-Augmented Large Multimodal Model for Long-Term Video Understanding ☆345 · Jul 19, 2024 · Updated last year
- [EMNLP-2022 Findings] Code for paper "ProGen: Progressive Zero-shot Dataset Generation via In-context Feedback". ☆27 · Feb 4, 2023 · Updated 3 years ago
- NExT-QA: Next Phase of Question-Answering to Explaining Temporal Actions (CVPR'21) ☆184 · Aug 2, 2025 · Updated 6 months ago
- [ICCV'25] HERMES: temporal-coHERent long-forM understanding with Episodes and Semantics ☆38 · Sep 10, 2025 · Updated 5 months ago
- Demos of some issues with LangChain. ☆31 · Jul 14, 2023 · Updated 2 years ago
- [NeurIPS 2023] Self-Chained Image-Language Model for Video Localization and Question Answering ☆195 · Jan 14, 2024 · Updated 2 years ago
- ☆17 · Sep 1, 2024 · Updated last year
- Sotopia-π: Interactive Learning of Socially Intelligent Language Agents (ACL 2024) ☆81 · May 7, 2024 · Updated last year
- Self-hosted GPT-4V api ☆27 · Nov 6, 2023 · Updated 2 years ago
- Long Context Transfer from Language to Vision ☆400 · Mar 18, 2025 · Updated 10 months ago
- [CVPR 2024] MovieChat: From Dense Token to Sparse Memory for Long Video Understanding ☆684 · Jan 29, 2025 · Updated last year
- Official code for TLDR: Unsupervised Goal-Conditioned RL via Temporal Distance-Aware Representations ☆36 · Jan 24, 2026 · Updated 3 weeks ago
- ☆138 · Sep 29, 2024 · Updated last year