jim-schwoebel / sound_event_detection
🎵 A repository for manually annotating files to create labeled acoustic datasets for machine learning.
☆44 Updated 3 years ago
Alternatives and similar repositories for sound_event_detection
Users who are interested in sound_event_detection are comparing it to the libraries listed below.
- Sound event detection with depthwise separable and dilated convolutions. ☆53 Updated 5 years ago
- Python library for audio augmentation ☆84 Updated 2 years ago
- Reproducible experimental protocols for multimedia (audio, video, text) database ☆104 Updated 5 months ago
- Dataset and baseline code for the VocalSound dataset (ICASSP2022). ☆143 Updated 2 years ago
- Neural network based similarity scoring for diarization (pytorch implementation of "LSTM based Similarity Measurement with Spectral Clust… ☆44 Updated 4 years ago
- End-to-end spoken language identification out of the box. ☆48 Updated 4 years ago
- A data annotation pipeline to generate high-quality, large-scale speech datasets with machine pre-labeling and fully manual auditing. ☆102 Updated 2 years ago
- WavEncoder is a Python library for encoding audio signals, transforms for audio augmentation, and training audio classification models wi… ☆91 Updated 4 years ago
- Python toolkit for speech processing ☆69 Updated 3 weeks ago
- Target speaker extraction and verification for multi-talker speech ☆179 Updated 4 years ago
- ☆93 Updated 2 years ago
- Speaker change detection using SincNet and an LSTM/Transformer ☆52 Updated last month
- Easy to use Audio Tagging in PyTorch ☆22 Updated 3 years ago
- Towards Intelligibility-Oriented Audio-Visual Speech Enhancement ☆14 Updated 10 months ago
- Phase-aware speech enhancement with Deep Complex U-Net ☆118 Updated 2 years ago
- Implementation for paper "iMetricGAN: Intelligibility Enhancement for Speech-in-Noise using Generative Adversarial Network-based Metric L… ☆55 Updated 2 years ago
- Prosody Morph: A Python library for manipulating pitch and duration in an algorithmic way, for resynthesizing speech. ☆85 Updated 2 years ago
- Zafar's Audio Functions in Python for audio signal analysis: STFT, inverse STFT, mel filterbank, mel spectrogram, MFCC, CQT kernel, CQT s… ☆56 Updated last year
- Evaluate EfficientAT models on the Holistic Evaluation of Audio Representations Benchmark. ☆31 Updated 2 years ago
- A self-supervised speech denoising strategy named Only-Noisy Training (ONT), which solves the speech denoising problem with only noisy au… ☆70 Updated 2 years ago
- Constrained Permutation Invariant Training, Speech Separation ☆47 Updated 4 years ago
- Author's repository for reproducing DcaseNet, an integrated pre-trained DNN that performs acoustic scene classification, audio tagging, a… ☆41 Updated 3 years ago
- Evaluation and Benchmarking of Speech Super-resolution Methods ☆151 Updated 3 years ago
- Attention Backend for Automatic Speaker Verification with Multiple Enrollment Utterances ☆50 Updated 2 years ago
- SMS-WSJ: Spatialized Multi-Speaker Wall Street Journal database for multi-channel source separation and recognition ☆118 Updated last year
- Phoneme segmentation using pre-trained speech models ☆55 Updated 2 years ago
- A set of audio augmentation techniques to perform noise insertion in datasets used for Automatic Speech Recognition. ☆43 Updated 3 years ago
- ☆13 Updated last year
- PyTorch Implementation of Google's Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions. This implementation supp… ☆48 Updated last year
- LogMMSE speech enhancement/noise reduction ☆88 Updated 5 years ago