SangwonSUH / realtime_YAMNET
Simple real-time Sound Event Detector based on YAMNet and pyaudio.
☆23Updated 5 years ago
Alternatives and similar repositories for realtime_YAMNET
Users who are interested in realtime_YAMNET are comparing it to the libraries listed below
- Estimating the Age, Height, and Gender of a speaker with their speech signal. https://arxiv.org/pdf/2110.13653.pdf ☆65Updated 4 years ago
- This repository contains the code related to the paper 'DENet: a deep architecture for audio surveillance applications'.☆42Updated last year
- Classify daily life events using audio data.☆52Updated 5 years ago
- ☆93Updated 2 years ago
- Music genre classification: LSTM vs Transformer☆61Updated 2 years ago
- Neural network based similarity scoring for diarization (pytorch implementation of "LSTM based Similarity Measurement with Spectral Clust…☆44Updated 4 years ago
- A TFLite-compatible fork of YAMNet from tensorflow/models☆30Updated 5 years ago
- Face Landmark-based Speaker-Independent Audio-Visual Speech Enhancement in Multi-Talker Environments☆108Updated last year
- Dataset and baseline code for the VocalSound dataset (ICASSP2022).☆142Updated 2 years ago
- Wav2Keyword is a keyword spotting (KWS) model based on Wav2Vec 2.0. It achieves state-of-the-art results on the Speech Commands datasets V1 and V2.☆107Updated 2 years ago
- 🎵 A repository for manually annotating files to create labeled acoustic datasets for machine learning.☆42Updated 3 years ago
- Speaker identification using voice MFCCs and GMM☆54Updated 4 years ago
- General purpose sound recognition demo☆157Updated last year
- ☆107Updated 4 years ago
- Kaldi based speaker verification☆47Updated 7 years ago
- This project performs speaker diarization for the Hindi language.☆50Updated 4 years ago
- Speaker diarization done on a toy dataset and tested on the TIMIT dataset☆7Updated 3 years ago
- Multi-Task Audio Source Separation, Two-Stage Model, Complex Domain.☆93Updated 2 years ago
- Foreign Accent Conversion by Synthesizing Speech from Phonetic Posteriorgrams (Interspeech'19)☆143Updated last year
- This repo contains my attempt to create a Speaker Recognition and Verification system using SideKit-1.3.1☆112Updated 6 years ago
- Feature extraction from sound signals along with a complete CNN model and evaluations using TensorFlow, Keras, and librosa for MFCC generat…☆10Updated 3 years ago
- ☆46Updated 9 months ago
- ☆84Updated 2 years ago
- Code and data repository for paper "VoxCeleb enrichment for Age and Gender recognition" submitted at ASRU 2021☆67Updated 3 years ago
- Voice Activity Detection (VAD) using deep learning.☆196Updated 5 years ago
- This repository aims at providing efficient CNNs for Audio Tagging. We provide AudioSet pre-trained models ready for downstream training …☆286Updated 7 months ago
- 📁 This repo makes it easy to download the raw audio files from AudioSet (32.45 GB, 632 classes).☆103Updated last year
- EfficientNet-Absolute Zero for Continuous Speech Keyword Spotting☆23Updated 3 years ago
- Speech Emotion Recognition☆42Updated last year
- A statistical model-based Voice Activity Detection☆192Updated 6 years ago