☆19 · Jun 8, 2021 · Updated 4 years ago
Alternatives and similar repositories for seeking-the-shape-of-sound
Users who are interested in seeking-the-shape-of-sound are comparing it to the repositories listed below
- SVHF-Net for Cross-modal binary matching ☆32 · Aug 22, 2018 · Updated 7 years ago
- This is the release code for CVPR2022 paper "Voice-Face Homogeneity Tells Deepfake". ☆15 · Mar 7, 2022 · Updated 4 years ago
- Voice Face Association Learning Paper List ☆17 · May 20, 2023 · Updated 2 years ago
- Implementation of our PR 2020 paper: Unsupervised Text-to-Image Synthesis ☆13 · Jul 9, 2020 · Updated 5 years ago
- A simplified version for DMC (Deep Multimodal Clustering for Unsupervised Audiovisual Learning) ☆19 · May 27, 2020 · Updated 5 years ago
- ☆17 · Nov 4, 2022 · Updated 3 years ago
- ☆19 · Jul 14, 2019 · Updated 6 years ago
- Official implementation of FOP method as described in "Fusion and Orthogonal Projection for Improved Face-Voice Association" ☆21 · Dec 31, 2025 · Updated 2 months ago
- PyTorch implementation of "Lip to Speech Synthesis with Visual Context Attentional GAN" (NeurIPS2021) ☆25 · Mar 9, 2024 · Updated 2 years ago
- ☆28 · Dec 22, 2021 · Updated 4 years ago
- Towards Adaptive ML Benchmarks: Web-Agent-Driven Construction, Domain Expansion, and Metric Optimization ☆20 · Sep 12, 2025 · Updated 5 months ago
- Frequency tracking in time-frequency representations ☆13 · Jan 19, 2021 · Updated 5 years ago
- (BMVC 2020 Oral) Neighbourhood-Insensitive Point Cloud Normal Estimation Network ☆10 · Jun 30, 2025 · Updated 8 months ago
- Deepfake face detection in forged videos, using explainable AI for model robustness as well as cost-sensitive methods for mitiga… ☆10 · May 27, 2024 · Updated last year
- The official PyTorch implementation for MM'21 paper 'Attribute-specific Control Units in StyleGAN for Fine-grained Image Manipulation' ☆39 · Dec 16, 2021 · Updated 4 years ago
- Official Code for Large-vocabulary forensic pathological analyses via prototypical cross-modal contrastive learning ☆16 · Jul 24, 2025 · Updated 7 months ago
- This branch of Asteroid contains code for the vocal harmony and chamber ensemble separation related papers. ☆12 · Nov 7, 2024 · Updated last year
- MXNet/Gluon implementation of the original (Gaussian) Variational Autoencoders (VAE) ☆10 · Dec 22, 2017 · Updated 8 years ago
- ☆10 · Jun 2, 2024 · Updated last year
- Code for the paper "learning to fool the speaker recognition" ☆10 · Jun 12, 2020 · Updated 5 years ago
- [TNNLS 2022] Official pytorch implementation of "Tackling the Challenges in Scene Graph Generation with Local-to-Global Interactions" ☆11 · Apr 19, 2022 · Updated 3 years ago
- ☆13 · Aug 7, 2025 · Updated 7 months ago
- ☆18 · Aug 7, 2025 · Updated 7 months ago
- Time frequency ridge detection based on relevant ridge portions ☆11 · Aug 17, 2023 · Updated 2 years ago
- MLR-Bench: Evaluating AI Agents on Open-Ended Machine Learning Research ☆23 · Sep 23, 2025 · Updated 5 months ago
- This repository contains the official code for "Flexible Biometrics Recognition: Bridging the Multimodality Gap through Attention, Alignm… ☆11 · Oct 9, 2024 · Updated last year
- Code and data recipes for the paper: Optimal Condition Training for Target Source Separation by Efthymios Tzinis, Gordon Wichern, Paris S… ☆14 · Feb 15, 2023 · Updated 3 years ago
- Pytorch implementation of our work "Domain-Invariant Representation Learning of Bird Sounds" (ICASSP 2026) ☆11 · Feb 24, 2026 · Updated 2 weeks ago
- A public repository for ConDo (AAAI25 accepted) ☆10 · Dec 21, 2024 · Updated last year
- Patch-Diffusion Code (AAAI2022) ☆13 · Mar 3, 2022 · Updated 4 years ago
- Pytorch implementation of "Towards Practical and Efficient Image-to-Speech Captioning with Vision-Language Pre-training and Multi-modal T… ☆12 · Mar 9, 2024 · Updated 2 years ago
- [ECCV 2022] Dual-Evidential Learning for Weakly-supervised Temporal Action Localization ☆49 · Apr 19, 2024 · Updated last year
- Speaker overlap-aware Neural Diarization ☆12 · Feb 13, 2023 · Updated 3 years ago
- [NeurIPS 2023] Official Implementation of "PaintSeg: Painting Pixels for Training-free Segmentation" ☆14 · Dec 31, 2023 · Updated 2 years ago
- Cross-Speaker Encoding Network for Multi-talker Speech Recognition ☆11 · Mar 14, 2025 · Updated 11 months ago
- [ICTC'24] - "Voice-Based Age and Gender Recognition: A Comparative Study of LSTM, RezoNet and Hybrid CNNs-BiLSTM Architecture" by Nhut Mi… ☆10 · Jan 16, 2025 · Updated last year
- ☆11 · Nov 5, 2025 · Updated 4 months ago
- ☆13 · Sep 1, 2025 · Updated 6 months ago
- Examples of how to use API of MVSep service ☆29 · Jun 21, 2025 · Updated 8 months ago