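Speech feature extraction for machine learning, including FBank and MFCC, with explanations of the underlying principles and a step-by-step implementation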
☆53May 17, 2019Updated 6 years ago
Alternatives and similar repositories for SpeechProcessForMachineLearning
Users that are interested in SpeechProcessForMachineLearning are comparing it to the libraries listed below.
- Calculate MFCC/Fbank feature for wav files☆15Nov 21, 2017Updated 8 years ago
- Python code to extract MFCC and FBANK speech features for Kaldi☆67Nov 28, 2018Updated 7 years ago
- Flask webapp/endpoint that compares the user's speech with different accents and assigns similarity scores based on speed, voice (DTW/MFC…☆18Jun 27, 2017Updated 8 years ago
- ☆22Jul 28, 2018Updated 7 years ago
- Repository for my paper: Dimensional Speech Emotion Recognition Using Acoustic Features and Word Embeddings using Multitask Learning☆17Aug 2, 2024Updated last year
- A summary of speech data augment algorithms☆69Jan 12, 2021Updated 5 years ago
- ☆15Nov 25, 2020Updated 5 years ago
- VoiceCode is an Open Source initiative started by the National Research Council of Canada, to develop a programming by voice toolbox. The…☆10Apr 17, 2020Updated 6 years ago
- Builds an acoustic model end to end, using characters as the modeling unit and a DCNN-CTC network architecture.☆70Jan 26, 2019Updated 7 years ago
- 3M: Multi-loss, Multi-path and Multi-level Neural Networks for speech recognition☆118Jun 22, 2022Updated 3 years ago
- PyTorch implementation of Continuous Speech Separation☆12Oct 5, 2022Updated 3 years ago
- ☆18Nov 15, 2021Updated 4 years ago
- Removes silence segments from wav audio files☆30Feb 29, 2020Updated 6 years ago
- ☆16Sep 4, 2019Updated 6 years ago
- This library provides common speech features for ASR including MFCCs and filterbank energies.☆2,422Oct 20, 2021Updated 4 years ago
- A set of speech feature extraction functions for ASR and speaker identification written in matlab.☆43Oct 28, 2016Updated 9 years ago
- Code for Vision-Infused Deep Audio Inpainting (ICCV 2019)☆58Oct 25, 2019Updated 6 years ago
- Supervised Speech Representation Learning for Parkinson's Disease Classification☆17Oct 26, 2021Updated 4 years ago
- ☆11Sep 26, 2022Updated 3 years ago
- Speaker identification using voice MFCCs and GMM☆54Dec 13, 2020Updated 5 years ago
- Gammatone feature for robust speech recognition☆14Aug 1, 2016Updated 9 years ago
- Whisper to Normal Speech Conversion with SC-MelGAN and SC-VQ-VAE☆15Dec 3, 2022Updated 3 years ago
- Mispronunciation detection code for jingju singing voice☆20Sep 5, 2018Updated 7 years ago
- Optimizing speaker verification and spoofing countermeasure systems together with REINFORCE☆13Mar 31, 2021Updated 5 years ago
- Counts frequencies of words using movie and television subtitles.☆20Jan 26, 2015Updated 11 years ago
- Speech recognition using Linear Predictive Cepstral Coefficients and the Dynamic Time Warping algorithm.☆15Feb 19, 2014Updated 12 years ago
- ☆13Sep 23, 2025Updated 7 months ago
- A bunch of experiments using Bark and Mel scales, wavelets and paraconsistent feature engineering in order to find the best methods to cl…☆13Aug 16, 2023Updated 2 years ago
- code for Multisample-based Contrastive Loss for Top-k Recommendation (IEEE TMM)☆10Nov 23, 2022Updated 3 years ago
- A No-Recurrence Sequence-to-Sequence Model for Speech Recognition☆378Jul 21, 2022Updated 3 years ago
- Delay-effect plugin made with JUCE.☆14Apr 6, 2019Updated 7 years ago
- Audio or speech signal processing guide.☆57Jul 16, 2018Updated 7 years ago
- Pytorch: Channel-wise subband (CWS) input for better voice and accompaniment separation☆102Nov 12, 2021Updated 4 years ago
- MXNet implementation of RNN Transducer (Graves 2012): Sequence Transduction with Recurrent Neural Networks☆140Jun 7, 2021Updated 4 years ago
- Java Implementation of the Sonopy Audio Feature Extraction Library by MycroftAI☆16Feb 10, 2020Updated 6 years ago
- DNN and RCED speech enhancement☆20Jan 30, 2024Updated 2 years ago
- Illustrating EM for GMMs and HMMs☆12May 9, 2020Updated 5 years ago
- Acoustic and language models for minorised languages.☆26Sep 30, 2020Updated 5 years ago
- Fourier Controller Networks (FCNet) for Real-Time Decision-Making in Embodied Learning, ICML 2024☆31Jan 2, 2025Updated last year