Starter code for working with the YouTube-8M dataset.
☆16 · Jun 9, 2017 · Updated 8 years ago
Alternatives and similar repositories for youtube-8m
Users who are interested in youtube-8m are comparing it to the repositories listed below.
- Content-Based Video-Music Retrieval using Soft Intra-Modal Structure Constraint ☆62 · Sep 22, 2017 · Updated 8 years ago
- Keras Implementation of "Look, Listen and Learn" Model ☆21 · Nov 14, 2017 · Updated 8 years ago
- Audio-Visual Event Localization in Unconstrained Videos, ECCV 2018 ☆209 · Apr 3, 2021 · Updated 5 years ago
- Official source code for the paper "Tailored Design of Audio-Visual Speech Recognition Models using Branchformers" ☆14 · Feb 24, 2025 · Updated last year
- use keras to do image classification tasks ☆12 · Dec 29, 2018 · Updated 7 years ago
- SongDriver2 achieves a balance between real-time emotion fit and soft transitions, enhancing the coherence of the generated music. ☆11 · Nov 15, 2025 · Updated 5 months ago
- ☆21 · Jul 15, 2024 · Updated last year
- Power-Guided Grouped SRU for Real-Time Causal Audio-Visual Speech Separation ☆26 · Nov 4, 2025 · Updated 6 months ago
- Official Implementation of BiFlow https://arxiv.org/abs/2512.10953 ☆50 · Feb 27, 2026 · Updated 2 months ago
- More reliable Video Understanding Evaluation ☆15 · Sep 23, 2025 · Updated 7 months ago
- Code for COBRA: Contrastive Bi-Modal Representation Algorithm (https://arxiv.org/abs/2005.03687) ☆15 · Jul 6, 2023 · Updated 2 years ago
- Official repository for the paper "Audio xLSTMs: Learning Self-supervised audio representations with xLSTMs" ☆21 · Sep 7, 2025 · Updated 8 months ago
- F-16 is a powerful video large language model (LLM) that perceives high-frame-rate videos, which is developed by the Department of Electr… ☆36 · Jul 3, 2025 · Updated 10 months ago
- Official code for DAM: Dynamic Adapter Merging for Continual Video QA Learning ☆15 · Apr 25, 2024 · Updated 2 years ago
- build vgg16 with pytorch 0.4.0 for classification of CIFAR datasets ☆10 · Mar 31, 2019 · Updated 7 years ago
- A simple, out-of-the-box and cross-platform bbox annotation tool by Python. Try it by `pip install easybox` ☆10 · May 28, 2021 · Updated 4 years ago
- Automated Music Therapy Sessions using Iris and Face Detection ☆11 · Sep 24, 2019 · Updated 6 years ago
- Source code of our ACM MM 2019 paper "A New Benchmark and Approach for Fine-grained Cross-media Retrieval". ☆58 · Dec 20, 2023 · Updated 2 years ago
- Official implementation of Efficient Speech Separation Framework Based on Neural State-Space Models ☆28 · Feb 25, 2026 · Updated 2 months ago
- Interactive/automated NVIDIA driver installation script supporting multiple Linux distributions ☆45 · Apr 27, 2026 · Updated last week
- ☆10 · Nov 27, 2024 · Updated last year
- ☆13 · Aug 21, 2022 · Updated 3 years ago
- This is the official repository for "Can GPTs Evaluate Graphic Design Based on Design Principles?". ☆13 · Feb 10, 2025 · Updated last year
- Codebase for ECCV18 "The Sound of Pixels" ☆392 · Apr 25, 2022 · Updated 4 years ago
- Code for Interspeech2022 paper DeID-VC: Speaker De-identification via Zero-shot Pseudo Voice Conversion ☆13 · May 6, 2023 · Updated 3 years ago
- An adjustment of the existing Virtual Makeup repository https://github.com/srivatsan-ramesh/Virtual-Makeup and https://github.com/badarsh… ☆11 · Mar 13, 2020 · Updated 6 years ago
- RLBench simulation project for autonomous bin picking using Pandas robot arm ☆10 · Mar 1, 2021 · Updated 5 years ago
- ☆26 · Aug 4, 2020 · Updated 5 years ago
- Code for Audio-Visual Target Speaker Extraction with Selective Auditory Attention (TASLP) ☆32 · Feb 28, 2025 · Updated last year
- Stable-diffusion-WebUI extensions, which enable tensorrt accelerated Unet for SDXL base model ☆12 · Oct 18, 2023 · Updated 2 years ago
- [Secondary Development] detection with ConvNext backbone ☆15 · May 24, 2022 · Updated 3 years ago
- [ACMMM 2022] ReCoRo: Region-Controllable Robust Light Enhancement by User-Specified Imprecise Masks ☆15 · Feb 6, 2023 · Updated 3 years ago
- A tool built on top of OpenFace to detect eye contact with babies. ☆13 · Nov 27, 2018 · Updated 7 years ago
- Tencent_AILab_ChineseEmbedding ☆12 · Dec 30, 2018 · Updated 7 years ago
- ☆35 · Jul 8, 2025 · Updated 10 months ago
- Fast Many Face Detection with C++/OpenFrameworks on macOS using Neural Networks ☆15 · Apr 19, 2019 · Updated 7 years ago
- This is a PyTorch implementation of baseline model of IROS2019 lifelong object recognition challenge. ☆15 · Oct 3, 2023 · Updated 2 years ago
- Local Feature Matching: Computer Vision University Project ☆20 · Apr 15, 2020 · Updated 6 years ago
- ☆15 · Apr 15, 2024 · Updated 2 years ago