VideoCC is a dataset containing (video-URL, caption) pairs for training video-text machine learning models. It is created using an automatic pipeline starting from the Conceptual Captions Image-Captioning Dataset.
☆78 · Dec 5, 2022 · Updated 3 years ago
Alternatives and similar repositories for videoCC-data
Users that are interested in videoCC-data are comparing it to the libraries listed below.
- ☆54 · Jul 31, 2022 · Updated 3 years ago
- [NeurIPS 2022] Zero-Shot Video Question Answering via Frozen Bidirectional Language Models ☆158 · Dec 9, 2024 · Updated last year
- Code and Models for "GeneCIS A Benchmark for General Conditional Image Similarity" ☆61 · Jun 12, 2023 · Updated 2 years ago
- Code release for "Learning Video Representations from Large Language Models" ☆534 · Oct 1, 2023 · Updated 2 years ago
- Code for paper "Super-CLEVR: A Virtual Benchmark to Diagnose Domain Robustness in Visual Reasoning" ☆47 · Feb 19, 2026 · Updated last month
- VaLM: Visually-augmented Language Modeling. ICLR 2023. ☆56 · Mar 6, 2023 · Updated 3 years ago
- A comprehensive framework to explore whether embodied multimodal models are plausibly resilient ☆13 · Nov 19, 2025 · Updated 4 months ago
- LL3M: Large Language and Multi-Modal Model in Jax ☆74 · Apr 23, 2024 · Updated last year
- [TPAMI2024] Codes and Models for VALOR: Vision-Audio-Language Omni-Perception Pretraining Model and Dataset ☆307 · Dec 25, 2024 · Updated last year
- Multi-modality pre-training ☆510 · May 8, 2024 · Updated last year
- Inverse DALL-E for Optical Character Recognition ☆38 · Oct 14, 2022 · Updated 3 years ago
- Official PyTorch implementation of the paper "Chapter-Llama: Efficient Chaptering in Hour-Long Videos with LLMs" ☆94 · Jun 6, 2025 · Updated 9 months ago
- Let's make a video clip ☆96 · Jul 29, 2022 · Updated 3 years ago
- Official code for our CVPR 2023 paper: Test of Time: Instilling Video-Language Models with a Sense of Time ☆46 · Jun 11, 2024 · Updated last year
- ☆19 · Dec 22, 2022 · Updated 3 years ago
- [ICLR 2022] "As-ViT: Auto-scaling Vision Transformers without Training" by Wuyang Chen, Wei Huang, Xianzhi Du, Xiaodan Song, Zhangyang Wa… ☆76 · Feb 21, 2022 · Updated 4 years ago
- ☆180 · Nov 14, 2025 · Updated 4 months ago
- CLIP4IDC: CLIP for Image Difference Captioning (AACL 2022) ☆36 · Nov 12, 2022 · Updated 3 years ago
- The implement of Commonsense Knowledge Aware Concept Selection For Diverse and Informative Visual Storytelling ☆12 · Aug 19, 2021 · Updated 4 years ago
- DataComp: In search of the next generation of multimodal datasets ☆773 · Apr 28, 2025 · Updated 10 months ago
- MultimodalC4 is a multimodal extension of c4 that interleaves millions of images with text. ☆954 · Mar 19, 2025 · Updated last year
- COLA: Evaluate how well your vision-language model can Compose Objects Localized with Attributes! ☆25 · Nov 23, 2024 · Updated last year
- WIT (Wikipedia-based Image Text) Dataset is a large multimodal multilingual dataset comprising 37M+ image-text sets with 11M+ unique imag… ☆1,102 · Sep 27, 2024 · Updated last year
- ImageNet3D: Towards General-Purpose Object-Level 3D Understanding ☆21 · Dec 6, 2024 · Updated last year
- Easily create large video dataset from video urls ☆653 · Jul 30, 2024 · Updated last year
- Repository for the paper: Teaching Structured Vision & Language Concepts to Vision & Language Models ☆47 · Sep 25, 2023 · Updated 2 years ago
- ☆13 · Jul 20, 2024 · Updated last year
- Implementation of <Symbolic Graphics Programming with Large Language Models> ☆38 · Sep 14, 2025 · Updated 6 months ago
- Sapsucker Woods 60 Audiovisual Dataset ☆18 · Oct 7, 2022 · Updated 3 years ago
- Code for "Are “Hierarchical” Visual Representations Hierarchical?" in NeurIPS Workshop for Symmetry and Geometry in Neural Representation… ☆22 · Nov 8, 2023 · Updated 2 years ago
- ☆32 · May 3, 2024 · Updated last year
- This repo contains documentation and code needed to use PACO dataset: data loaders and training and evaluation scripts for objects, parts… ☆293 · Feb 12, 2024 · Updated 2 years ago
- Official PyTorch implementation of the paper "CoVR: Learning Composed Video Retrieval from Web Video Captions". ☆119 · Oct 9, 2025 · Updated 5 months ago
- ☆58 · Apr 24, 2024 · Updated last year
- Code for the paper titled "CiT Curation in Training for Effective Vision-Language Data". ☆78 · Jan 18, 2023 · Updated 3 years ago
- ☆109 · Dec 23, 2022 · Updated 3 years ago
- [ICCV 2023] You Only Look at One Partial Sequence ☆343 · Oct 21, 2023 · Updated 2 years ago
- [CVPR2023] The code for 《Position-guided Text Prompt for Vision-Language Pre-training》 ☆151 · Jun 7, 2023 · Updated 2 years ago
- This is the official repository for the LENS (Large Language Models Enhanced to See) system. ☆355 · Jul 22, 2025 · Updated 8 months ago