☆96 Sep 19, 2024 Updated last year
Alternatives and similar repositories for fineVideo
Users that are interested in fineVideo are comparing it to the libraries listed below
- A huge dataset for Document Visual Question Answering ☆20 Jul 29, 2024 Updated last year
- Video-LLaVA fine-tune for CinePile evaluation ☆51 Aug 8, 2024 Updated last year
- YOLOv10: Real-Time End-to-End Object Detection ☆12 May 24, 2024 Updated last year
- Competition of Mechanisms: Tracing How Language Models Handle Facts and Counterfactuals ☆12 May 24, 2024 Updated last year
- ☆10 Nov 18, 2024 Updated last year
- ☆33 Jul 9, 2025 Updated 7 months ago
- Hugging Face Jobs ☆19 Jul 11, 2025 Updated 7 months ago
- ☆16 Jul 8, 2024 Updated last year
- ☆83 May 6, 2025 Updated 9 months ago
- A list of podcast URLs scraped from the Apple podcast database in late 2021, including a script for downloading those podcasts. ☆43 Mar 9, 2022 Updated 3 years ago
- Official PyTorch Implementation of Opt-CWM: Self-Supervised Learning of Motion Concepts by Optimizing Counterfactuals. ☆22 Mar 27, 2025 Updated 11 months ago
- ☆20 Nov 18, 2024 Updated last year
- ☆22 Jun 30, 2021 Updated 4 years ago
- Quick exploration into fine-tuning Florence-2 ☆338 Sep 19, 2024 Updated last year
- What Is a Good Caption? A Comprehensive Visual Caption Benchmark for Evaluating Both Correctness and Thoroughness ☆26 May 16, 2025 Updated 9 months ago
- The official repo for "OpenMoE 2: Sparse Diffusion Language Models". ☆52 Dec 28, 2025 Updated 2 months ago
- Official implementation of EgoHOD at ICLR 2025; 14 EgoVis Challenge Winners in CVPR 2024 ☆32 Nov 25, 2025 Updated 3 months ago
- Reinforcement Learning Tuning for VideoLLMs: Reward Design and Data Efficiency ☆60 Jun 6, 2025 Updated 8 months ago
- [ICCVW 25] LLaVA-MORE: A Comparative Study of LLMs and Visual Backbones for Enhanced Visual Instruction Tuning ☆159 Aug 8, 2025 Updated 6 months ago
- Contrastive Video Question Answering via Video Graph Transformer (IEEE T-PAMI'23) ☆19 Mar 9, 2024 Updated last year
- [ICCV 2025] Auto Interpretation Pipeline and many other functionalities for Multimodal SAE Analysis. ☆183 Sep 26, 2025 Updated 5 months ago
- The repository for IEEE CVPR 2023 (A Light Weight Model for Active Speaker Detection) ☆166 Mar 23, 2025 Updated 11 months ago
- Solos: A Dataset for Audio-Visual Music Analysis ☆24 Feb 17, 2023 Updated 3 years ago
- Python library for building and running distributed data pipelines using Ray ☆54 Dec 16, 2025 Updated 2 months ago
- [ACCV 2024] Official Implementation of "AutoAD-Zero: A Training-Free Framework for Zero-Shot Audio Description". Junyu Xie, Tengda Han, M… ☆28 Jan 28, 2025 Updated last year
- Profile your CoreML models directly from Python 🐍 ☆30 Sep 8, 2025 Updated 5 months ago
- Demo for 2022 ICASSP ☆64 Jun 14, 2022 Updated 3 years ago
- Use one line of code to call the SadTalker API with ModelScope ☆24 Nov 18, 2023 Updated 2 years ago
- [CVPR 2024] MovieChat: From Dense Token to Sparse Memory for Long Video Understanding ☆686 Jan 29, 2025 Updated last year
- ☆107 Jul 30, 2024 Updated last year
- [ICCV 2025] LVBench: An Extreme Long Video Understanding Benchmark ☆137 Jul 9, 2025 Updated 7 months ago
- 👾 E.T. Bench: Towards Open-Ended Event-Level Video-Language Understanding (NeurIPS 2024) ☆74 Jan 20, 2025 Updated last year
- Awesome papers & datasets specifically focused on long-term videos. ☆355 Oct 9, 2025 Updated 4 months ago
- ☆32 Jul 29, 2024 Updated last year
- [PR 2024] A large Cross-Modal Video Retrieval Dataset with Reading Comprehension ☆28 Dec 28, 2023 Updated 2 years ago
- Working on parallel WaveNet ☆25 Apr 19, 2018 Updated 7 years ago
- A curated list of awesome DUST3R/MAST3R related papers. ☆34 Aug 5, 2025 Updated 6 months ago
- Scaling Vision Pre-Training to 4K Resolution ☆221 Jan 4, 2026 Updated last month
- [DEPRECATED] [PyTorch 2.0] [638M] [85.33% acc] Full-attention multi-instrumental music transformer for supervised music generation, opti… ☆32 Nov 23, 2023 Updated 2 years ago