lucas-ventura / chapter-llama
Official PyTorch implementation of the paper "Chapter-Llama: Efficient Chaptering in Hour-Long Videos with LLMs"
☆50 Updated this week
Alternatives and similar repositories for chapter-llama:
Users who are interested in chapter-llama are comparing it to the repositories listed below.
- Code for "Don’t drop your samples! Coherence-aware training benefits Conditional diffusion" CVPR 2024 Highlight ☆51 Updated last month
- Reliability in Semantic Segmentation: Can We Use Synthetic Data? (ECCV 2024) ☆35 Updated 8 months ago
- Official PyTorch implementation of the paper "CoVR: Learning Composed Video Retrieval from Web Video Captions". ☆104 Updated this week
- Code for "Around the World in 80 Timesteps: A Generative Approach to Global Visual Geolocation" ☆67 Updated 3 months ago
- Code for "How far can we go with ImageNet for Text-to-Image generation?" paper ☆82 Updated 3 weeks ago
- Official implementation of the Polynomial Mixer ☆18 Updated last month
- ☆68 Updated 5 months ago
- Code for the paper "AMEGO: Active Memory from long EGOcentric videos" published at ECCV 2024 ☆38 Updated 4 months ago
- High order Moment Models ☆38 Updated this week
- (CVPRW 2022) Learning Co-segmentation by Segment Swapping for Retrieval and Discovery ☆53 Updated 2 years ago
- Toolbox for the Earth Parser Dataset, a dataset presented in the "Learnable Earth Parser: Discovering 3D Prototypes in Aerial Scans" pape… ☆26 Updated last year
- This repo contains the official implementation of ICLR 2024 paper "Is ImageNet worth 1 video? Learning strong image encoders from 1 long … ☆85 Updated 10 months ago
- Code for the paper "GenHowTo: Learning to Generate Actions and State Transformations from Instructional Videos" published at CVPR 2024 ☆51 Updated last year
- FleVRS: Towards Flexible Visual Relationship Segmentation, NeurIPS 2024 ☆20 Updated 4 months ago
- Code and data for the paper "Emergent Visual-Semantic Hierarchies in Image-Text Representations" (ECCV 2024) ☆26 Updated 7 months ago
- (3DV 2021 oral) PyTorch implementation of paper "PoseContrast: Class-Agnostic Object Viewpoint Estimation in the Wild with Pose-Aware Con… ☆44 Updated last year
- ☆15 Updated last year
- Code implementation for paper titled "HOI-Ref: Hand-Object Interaction Referral in Egocentric Vision" ☆26 Updated 11 months ago
- ☆59 Updated last year
- Diffusion Models as Data Mining Tools ☆53 Updated last month
- [ACCV 2024] Official Implementation of "AutoAD-Zero: A Training-Free Framework for Zero-Shot Audio Description". Junyu Xie, Tengda Han, M… ☆25 Updated 2 months ago
- Data release for Step Differences in Instructional Video (CVPR24) ☆13 Updated 9 months ago
- ☆88 Updated 3 months ago
- ☆23 Updated 5 months ago
- 🔥 [CVPR 2024] Official implementation of "See, Say, and Segment: Teaching LMMs to Overcome False Premises (SESAME)" ☆37 Updated 9 months ago
- [CVPR 2024 Highlight] SPOT: Self-Training with Patch-Order Permutation for Object-Centric Learning with Autoregressive Transformers ☆63 Updated 9 months ago
- [NeurIPS2023] 3D-OWIS is capable of detecting unknown instances in inference, and progressively learning novel classes in the process of … ☆67 Updated last year
- [ECCV2022] D3Net: A Unified Speaker-Listener Architecture for 3D Dense Captioning and Visual Grounding ☆42 Updated 2 years ago
- Official implementation of the paper "Unifying 3D Vision-Language Understanding via Promptable Queries" ☆73 Updated 8 months ago
- [ECCV'24] Official PyTorch implementation of In Defense of Lazy Visual Grounding for Open-Vocabulary Semantic Segmentation ☆40 Updated 6 months ago