Supercharged BLIP-2 that can handle videos
☆124Dec 1, 2023Updated 2 years ago
Alternatives and similar repositories for VideoBLIP
Users that are interested in VideoBLIP are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.
Sorting:
- EILeV: Eliciting In-Context Learning in Vision-Language Models for Videos Through Curated Data Distributional Properties☆133Nov 10, 2024Updated last year
- This repository contains the Adverbs in Recipes (AIR) dataset and the code published at the CVPR 23 paper: "Learning Action Changes by Me…☆13May 25, 2023Updated 3 years ago
- A simple script that reads a directory of videos, grabs a random frame, and automatically discovers a prompt for it☆142Jan 22, 2024Updated 2 years ago
- Code and datasets for "Text encoders are performance bottlenecks in contrastive vision-language models". Coming soon!☆11May 24, 2023Updated 3 years ago
- ☆17Jul 30, 2024Updated last year
- End-to-end encrypted email - Proton Mail • AdSpecial offer: 40% Off Yearly / 80% Off First Month. All Proton services are open source and independently audited for security.
- ☆39Dec 4, 2023Updated 2 years ago
- The HD-VG-130M Dataset☆126Apr 8, 2024Updated 2 years ago
- ☆13Jul 20, 2024Updated last year
- Official Repo for Tuning-Free Noise Rectification for High Fidelity Image-to-Video Generation☆30Mar 29, 2024Updated 2 years ago
- Official Code for DOROTHIE: Spoken Dialogue for Handling Unexpected Situations in Interactive Autonomous Driving Agents (Findings of EMNL…☆22Oct 24, 2023Updated 2 years ago
- ☆17Dec 13, 2023Updated 2 years ago
- team Doggeee's solution to Ego4D LTA challenge@CVPRW23'☆14Nov 4, 2023Updated 2 years ago
- ViLMA: A Zero-Shot Benchmark for Linguistic and Temporal Grounding in Video-Language Models (ICLR 2024, Official Implementation)☆16Jan 18, 2024Updated 2 years ago
- List of papers on Hallucination in LMM☆10Nov 29, 2023Updated 2 years ago
- Managed hosting for WordPress and PHP on Cloudways • AdManaged hosting for WordPress, Magento, Laravel, or PHP apps, on multiple cloud providers. Deploy in minutes on Cloudways by DigitalOcean.
- Code for EMNLP 2022 Paper DANLI: Deliberative Agent for Following Natural Language Instructions☆18May 1, 2025Updated last year
- Official code for the ACL 2021 Findings paper "Yichi Zhang and Joyce Chai. Hierarchical Task Learning from Language Instructions with Uni…☆24Jun 28, 2021Updated 4 years ago
- An automatic MLLM hallucination detection framework☆19Sep 26, 2023Updated 2 years ago
- ORES: Open-vocabulary Responsible Visual Synthesis☆14Dec 12, 2023Updated 2 years ago
- FlexiFilm: Long Video Generation with Flexible Conditions☆31May 1, 2024Updated 2 years ago
- mPLUG-2: A Modularized Multi-modal Foundation Model Across Text, Image and Video (ICML 2023)☆227Jul 21, 2023Updated 2 years ago
- [EMNLP 2023 Demo] Video-LLaMA: An Instruction-tuned Audio-Visual Language Model for Video Understanding☆3,143Jun 4, 2024Updated 2 years ago
- ☆19Sep 19, 2024Updated last year
- Detectron2 is a platform for object detection, segmentation and other visual recognition tasks.☆19May 7, 2022Updated 4 years ago
- Wordpress hosting with auto-scaling - Free Trial Offer • AdFully Managed hosting for WordPress and WooCommerce businesses that need reliable, auto-scalable performance. Cloudways SafeUpdates now available.
- [ICCV2023 Oral] Unmasked Teacher: Towards Training-Efficient Video Foundation Models☆348May 27, 2024Updated 2 years ago
- [ICLR2023] Video Scene Graph Generation from Single-Frame Weak Supervision☆12Sep 17, 2023Updated 2 years ago
- "From ViT Features to Training-free Video Object Segmentation via Streaming-data Mixture Models" [Uziel, Dinari, and Freifeld, NeurIPS 20…☆13Jan 16, 2024Updated 2 years ago
- Code release for "Learning Video Representations from Large Language Models"☆533Oct 1, 2023Updated 2 years ago
- The benchmark for "Video Object Segmentation in Panoptic Wild Scenes".☆12Oct 17, 2023Updated 2 years ago
- ☆26Feb 17, 2026Updated 3 months ago
- Implementation of paper 'Helping Hands: An Object-Aware Ego-Centric Video Recognition Model'☆33Nov 7, 2023Updated 2 years ago
- LTX-Video-Trainer-GUI 是为LTX视频lora模型训练提供的GUI工具,支持通过简单的界面训练 LoRA 模型用于视频生成。本训练器提供了直观的 GUI 界面,使用户能够轻松设置和启动训练流程,无需编写复杂代码。☆13Jul 18, 2025Updated 10 months ago
- Official code for "Audio-Guided Attention Network for Weakly Supervised Violence Detection" (ICCECE2022).☆13Mar 25, 2022Updated 4 years ago
- Managed Kubernetes at scale on DigitalOcean • AdDigitalOcean Kubernetes includes the control plane, bandwidth allowance, container registry, automatic updates, and more for free.
- ☆119Feb 19, 2024Updated 2 years ago
- COLA: Evaluate how well your vision-language model can Compose Objects Localized with Attributes!☆25May 14, 2026Updated last month
- [NeurIPS 2024] Official Repository of Multi-Object Hallucination in Vision-Language Models☆37Nov 13, 2024Updated last year
- ☆101May 16, 2024Updated 2 years ago
- ☆209Jul 12, 2024Updated last year
- [CVPR 2023] Official Implementation of X-Decoder for generalized decoding for pixel, image and language☆17Apr 4, 2023Updated 3 years ago
- Large-scale text-video dataset. 10 million captioned short videos.☆682Aug 14, 2024Updated last year