FIGLAB / DirectionOfVoice
Direction-of-Voice (DoV) Estimation for Intuitive Speech Interaction with Smart Devices Ecosystems
☆34 · Updated 2 years ago
Alternatives and similar repositories for DirectionOfVoice:
Users who are interested in DirectionOfVoice are comparing it to the repositories listed below.
- SoundNet, built in Keras with a pre-trained 8-layer model. ☆29 · Updated 5 years ago
- Keras framework for speech enhancement using relativistic GANs ☆52 · Updated 4 years ago
- Benchmark for the sound event localization task of the DCASE 2019 challenge ☆76 · Updated 4 years ago
- Constrained Permutation Invariant Training, Speech Separation ☆47 · Updated 4 years ago
- The Cone of Silence: ☆152 · Updated 2 years ago
- ☆58 · Updated 6 years ago
- Face Landmark-based Speaker-Independent Audio-Visual Speech Enhancement in Multi-Talker Environments ☆107 · Updated last year
- Deep Neural Network for Speaker Count Estimation ☆148 · Updated 4 years ago
- Few-Shot Keyword Spotting ☆63 · Updated 3 years ago
- Work inspired by the Microsoft Research project on SER using ELM ☆19 · Updated 6 years ago
- Deep neural network (DNN) for noise reduction, removal of background music, and speech separation ☆172 · Updated 2 years ago
- This repository provides information on how to use the SINS database along with some example code. The SINS Dataset is composed of conti… ☆23 · Updated 2 years ago
- ☆37 · Updated 2 years ago
- Keras (TensorFlow) implementation of SincNet (Mirco Ravanelli, Yoshua Bengio - https://github.com/mravanelli/SincNet) ☆72 · Updated 3 years ago
- Audio classification with VGGish as feature extractor in TensorFlow ☆128 · Updated 3 years ago
- RASTA-PLP and MFCC tool based on rasta-mat ☆33 · Updated 2 years ago
- Instructions on downloading and using the LibriAdapt dataset ☆46 · Updated 3 years ago
- ICASSP2019 Tutorial: Detection and Classification of Acoustic Scenes and Events / Code examples ☆41 · Updated 5 years ago
- Baseline method for the sound event localization task of the DCASE 2020 challenge ☆54 · Updated 4 years ago
- Lightweight neural speaker embedding extraction based on Kaldi and PyTorch. ☆136 · Updated 5 years ago
- Baseline systems for the FSD50K dataset ☆68 · Updated 3 years ago
- Sound event detection with depthwise separable and dilated convolutions. ☆53 · Updated 5 years ago
- A two-step optimization for sound source separation on the adaptive front-end domain ☆67 · Updated 4 years ago
- Audio-Visual Speech Recognition using Sequence to Sequence Models ☆82 · Updated 4 years ago
- Python implementation of pre-processing for End-to-End speech recognition ☆69 · Updated 7 years ago
- Convert Kaldi feature extraction and nnet3 models into TensorFlow Lite models. Currently aimed at converting Kaldi's x-vector models and … ☆20 · Updated 2 years ago
- Discriminative Neural Clustering for Speaker Diarisation ☆78 · Updated 2 years ago
- ☆60 · Updated 4 years ago
- Speaker diarization based on Kaldi x-vectors, tuned for 16k microphone data ☆96 · Updated last year
- ☆21 · Updated 6 years ago