Starter code for working with the YouTube-8M dataset.
☆16Jun 9, 2017Updated 9 years ago
Alternatives and similar repositories for youtube-8m
Users that are interested in youtube-8m are comparing it to the libraries listed below.
- Content-Based Video-Music Retrieval using Soft Intra-Modal Structure Constraint☆62Sep 22, 2017Updated 8 years ago
- Keras Implementation of "Look, Listen and Learn" Model☆21Nov 14, 2017Updated 8 years ago
- Cross-modality (visual-auditory) Metric Learning Project☆15Dec 19, 2017Updated 8 years ago
- Audio-Visual Event Localization in Unconstrained Videos, ECCV 2018☆209Apr 3, 2021Updated 5 years ago
- Official source code for the paper "Tailored Design of Audio-Visual Speech Recognition Models using Branchformers"☆14Feb 24, 2025Updated last year
- PyTorch implementation of 'See, Hear, and Read: Deep Aligned Representations'☆33Dec 17, 2018Updated 7 years ago
- Unofficial implementation of the music separation model by Luo et al.☆13Nov 3, 2019Updated 6 years ago
- SongDriver2 achieves a balance between real-time emotion fit and soft transitions, enhancing the coherence of the generated music.☆11Nov 15, 2025Updated 7 months ago
- VideoEval-Pro: Robust and Realistic Long Video Understanding Evaluation [TMLR26]☆17Jun 1, 2026Updated 2 weeks ago
- Official repository for the paper "Audio xLSTMs: Learning Self-supervised audio representations with xLSTMs"☆21Sep 7, 2025Updated 9 months ago
- F-16 is a powerful video large language model (LLM) that perceives high-frame-rate videos, which is developed by the Department of Electr…☆37Jul 3, 2025Updated 11 months ago
- Automated Music Therapy Sessions using Iris and Face Detection☆11Sep 24, 2019Updated 6 years ago
- The official source code of our AAAI25 paper "D&M: Enriching E-commerce Videos with Sound Effects by Key Moment Detection and SFX Matchin…☆10Feb 9, 2025Updated last year
- Text-to-Speech Synthesis by Generating Spectrograms using Generative Adversarial Network☆10Dec 12, 2018Updated 7 years ago
- Research project on glyph-based Chinese character embedding. Preparing for EMNLP 2019☆11Mar 18, 2019Updated 7 years ago
- Singing voice timbre conversion model based on vits and softvc☆12Jan 9, 2023Updated 3 years ago
- Code for CVPR 2021 paper Exploring Heterogeneous Clues for Weakly-Supervised Audio-Visual Video Parsing☆24Dec 29, 2021Updated 4 years ago
- This is the official repository for "Can GPTs Evaluate Graphic Design Based on Design Principles?".☆13Feb 10, 2025Updated last year
- [ACM MM 2021 Best Paper Award] Video Background Music Generation with Controllable Music Transformer☆325Jun 8, 2025Updated last year
- Extracts image features using pretrained vgg-16/19 models☆26Nov 2, 2018Updated 7 years ago
- Learning Cross-Modal Embeddings with Adversarial Networks for Cooking Recipes and Food Images☆30Jun 14, 2019Updated 7 years ago
- Codebase for ECCV18 "The Sound of Pixels"☆393Apr 25, 2022Updated 4 years ago
- An adjustment of the existing Virtual Makeup repository https://github.com/srivatsan-ramesh/Virtual-Makeup and https://github.com/badarsh…☆11Mar 13, 2020Updated 6 years ago
- Reproducible code for Augmentation paper☆17Jan 23, 2019Updated 7 years ago
- RLBench simulation project for autonomous bin picking using the Panda robot arm☆10Mar 1, 2021Updated 5 years ago
- EEG-based emotion classification transformer on the DEAP dataset☆14Feb 26, 2022Updated 4 years ago
- The paper "A Two-Stream Siamese Neural Network for Vehicle Re-Identification by Using Non-Overlapping Cameras"☆31Jan 15, 2020Updated 6 years ago
- Stable-diffusion-WebUI extension that enables a TensorRT-accelerated UNet for the SDXL base model☆12Oct 18, 2023Updated 2 years ago
- A library to manipulate Inkscape SVG content using Python 3☆12Apr 28, 2021Updated 5 years ago
- Code for Audio-Visual Target Speaker Extraction with Selective Auditory Attention (TASLP)☆32Feb 28, 2025Updated last year
- Code for the paper: Audio-Visual Scene Analysis with Self-Supervised Multisensory Features☆225Jul 17, 2019Updated 6 years ago
- [ACMMM 2022] ReCoRo: Region-Controllable Robust Light Enhancement by User-Specified Imprecise Masks☆15Feb 6, 2023Updated 3 years ago
- A tool built on top of OpenFace to detect eye contact with babies.☆13Nov 27, 2018Updated 7 years ago
- Fast Many Face Detection with C++/OpenFrameworks on macOS using Neural Networks☆15Apr 19, 2019Updated 7 years ago