SangwonSUH / realtime_YAMNET
Simple real-time Sound Event Detector based on YAMNet and pyaudio.
☆21Updated 5 years ago
Alternatives and similar repositories for realtime_YAMNET:
Users who are interested in realtime_YAMNET are comparing it to the repositories listed below
- Estimating the Age, Height, and Gender of a speaker with their speech signal. https://arxiv.org/pdf/2110.13653.pdf ☆64Updated 3 years ago
- This repository contains the code related to the paper 'DENet: a deep architecture for audio surveillance applications'.☆41Updated last year
- General purpose sound recognition demo☆155Updated last year
- Classify daily life events using audio data.☆50Updated 5 years ago
- ☆103Updated 4 years ago
- Sound Classification using Librosa, ffmpeg, CNN, Keras, XGBOOST, Random Forest.☆66Updated last year
- Pytorch implementation of deep audio embedding calculation☆101Updated last year
- ☆90Updated 2 years ago
- WavEncoder is a Python library for encoding audio signals, transforms for audio augmentation, and training audio classification models wi…☆89Updated 3 years ago
- A two-stage polyphonic sound event detection and localization method for both SED and DOA.☆110Updated 2 years ago
- 🎵 A repository for manually annotating files to create labeled acoustic datasets for machine learning.☆41Updated 2 years ago
- An in-depth analysis of audio classification on the RAVDESS dataset. Feature engineering, hyperparameter optimization, model evaluation, …☆75Updated 4 years ago
- Face Landmark-based Speaker-Independent Audio-Visual Speech Enhancement in Multi-Talker Environments☆106Updated 10 months ago
- Neural network based similarity scoring for diarization (pytorch implementation of "LSTM based Similarity Measurement with Spectral Clust…☆44Updated 4 years ago
- ☆42Updated 5 months ago
- Audio classification with VGGish as feature extractor in TensorFlow☆127Updated 3 years ago
- A TFLite-compatible fork of YAMNet from tensorflow/models☆29Updated 4 years ago
- Code and data repository for paper "VoxCeleb enrichment for Age and Gender recognition" submitted at ASRU 2021☆67Updated 3 years ago
- Implementation of the paper "Improved End-to-End Speech Emotion Recognition Using Self Attention Mechanism and Multitask Learning" From I…☆57Updated 4 years ago
- Classification of 11 types of audio clips using MFCCs features and LSTM. Pretrained on Speech Command Dataset with intensive data augment…☆42Updated 2 years ago
- Detecting emotions using MFCC features of human speech using Deep Learning☆125Updated 4 years ago
- EfficientNet-Absolute Zero for Continuous Speech Keyword Spotting☆23Updated 2 years ago
- An implementation of vggish in keras with tf backend☆117Updated 3 years ago
- In this repository, we explore using a hybrid system consisting of a Convolutional Neural Network and a Support Vector Machine for Keywor…☆97Updated 2 years ago
- Speaker diarization done on a toy dataset and tested on the TIMIT dataset☆8Updated 3 years ago
- Repository hosting code and slides of the Audio Data Augmentation series on The Sound of AI YT channel.☆37Updated 3 years ago
- 📁 This repo makes it easy to download the raw audio files from AudioSet (32.45 GB, 632 classes).☆100Updated last year
- End-to-End Keyword Spotting (E2E-KWS) using a character level LSTM☆39Updated 2 years ago
- Predicting various emotions in human speech signals by detecting different speech components affected by human emotion.☆45Updated 6 months ago
- Few-Shot Keyword Spotting☆63Updated 3 years ago