ChrisNick92 / deep-audio-fingerprinting
A repository for my MSc thesis in Data Science & Machine Learning @ NTUA. A deep learning approach to audio fingerprinting for recognizing songs in real time through the microphone.
☆34 Updated 4 months ago
Alternatives and similar repositories for deep-audio-fingerprinting:
Users interested in deep-audio-fingerprinting are comparing it to the repositories listed below:
- Official PyTorch implementation of CoverHunter ☆29 Updated 4 months ago
- ☆13 Updated last year
- Contains the code associated with the ICLR submission for our text-to-speech diffusion model ☆53 Updated last year
- Official implementation of DualCycleGAN for nonparallel audio super resolution ☆53 Updated 2 years ago
- An official implementation of the ICASSP 2024 paper: Dual-Path TFC-TDF UNet for Music Source Separation ☆85 Updated last year
- ☆43 Updated 9 months ago
- [Interspeech 2024] Hold Me Tight: Stable Encoder-Decoder Design for Speech Enhancement ☆37 Updated 3 months ago
- Streaming Audiotransformers for online Audio tagging ☆43 Updated 9 months ago
- Inference codebase for "Cacophony: An Improved Contrastive Audio-Text Model". Preprint: https://arxiv.org/abs/2402.06986 ☆44 Updated 5 months ago
- FREECODEC: A DISENTANGLED NEURAL SPEECH CODEC WITH FEWER TOKENS ☆19 Updated 6 months ago
- ☆47 Updated this week
- Pytorch implementation of "CleanMel: Mel-Spectrogram Enhancement for Improving Both Speech Quality and ASR". ☆45 Updated this week
- Learning differentiable temporal resolution on time-series data. ☆36 Updated 2 years ago
- ☆21 Updated last year
- Production first, nn-based on-device signal processing toolkit. ☆64 Updated last year
- Official implementation of paper: Frame-Wise Breath Detection with Self-Training: An Exploration of Enhancing Breath Naturalness in Text-… ☆25 Updated 6 months ago
- TAPE: An End-to-End Timbre-Aware Pitch Estimator ☆22 Updated last year
- PAM is a no-reference audio quality metric for audio generation tasks ☆57 Updated 8 months ago
- Apply Score diffusion to improve speech signals recorded under various adverse conditions and distortions, including noise, reverberation… ☆58 Updated 8 months ago
- Pytorch implementation of subband decomposition ☆92 Updated 2 years ago
- Fully Quantized Neural Networks For Speech Enhancement ☆61 Updated last year
- An invertible and differentiable implementation of the Constant-Q Transform (CQT). ☆58 Updated 2 years ago
- A trainer for SNAC (Multi-Scale Neural Audio Codec) that replaces the decoder with Vocos. ☆37 Updated 5 months ago
- PodcastMix: A dataset for separating music and speech in podcasts. ☆43 Updated 7 months ago
- This is the code and dataset repo for Interspeech 2024 paper "Target conversation extraction: Source separation using turn-taking dynamic… ☆47 Updated 5 months ago
- Official PyTorch implementation of "RVAE-EM: Generative speech dereverberation based on recurrent variational auto-encoder and convolutiv… ☆43 Updated 3 weeks ago
- ☆23 Updated last year
- Autovocoder: Fast Waveform Generation from a Learned Speech Representation using Differentiable Digital Signal Processing ☆70 Updated 2 years ago
- ☆13 Updated last year
- Inference code for PaSST, using the HEAR API. ☆31 Updated last year