Speech feature extraction for machine learning, including FBank and MFCC, with explanations of the principles and a step-by-step implementation
☆54 May 17, 2019 Updated 7 years ago
Alternatives and similar repositories for SpeechProcessForMachineLearning
Users that are interested in SpeechProcessForMachineLearning are comparing it to the libraries listed below.
- Calculate MFCC/Fbank feature for wav files ☆15 Nov 21, 2017 Updated 8 years ago
- deep neural network based workflow for noise suppression and signal recovery of real-world LIGO observational data ☆16 Mar 14, 2024 Updated 2 years ago
- python codes to extract MFCC and FBANK speech features for Kaldi ☆67 Nov 28, 2018 Updated 7 years ago
- Flask webapp/endpoint that compares the user's speech with different accents and assigns similarity scores based on speed, voice (DTW/MFC… ☆18 Jun 27, 2017 Updated 8 years ago
- A set of scripts that extract speech features (so far MFCCs, FBANKs, STFT, and kinda dominant frequency) and trains CNN, LSTM, or CNN+LST… ☆55 Mar 24, 2023 Updated 3 years ago
- ☆22 Jul 28, 2018 Updated 7 years ago
- ☆10 May 22, 2023 Updated 3 years ago
- Repository for my paper: Dimensional Speech Emotion Recognition Using Acoustic Features and Word Embeddings using Multitask Learning ☆17 Aug 2, 2024 Updated last year
- A Python neural network made with TensorFlow that converts one person's voice into another. ☆10 Jan 16, 2021 Updated 5 years ago
- A summary of speech data augment algorithms ☆68 Jan 12, 2021 Updated 5 years ago
- Builds an acoustic model with an end-to-end approach, using characters as the modeling unit and a DCNN-CTC network architecture. ☆70 Jan 26, 2019 Updated 7 years ago
- 3M: Multi-loss, Multi-path and Multi-level Neural Networks for speech recognition ☆119 Jun 22, 2022 Updated 3 years ago
- PyTorch implementation of Continuous Speech Separation ☆12 Oct 5, 2022 Updated 3 years ago
- Generative Adversarial Networks for different impaired speech conversions ☆39 Jul 6, 2023 Updated 2 years ago
- iSeparate library for the SDX2023 challenge ☆15 Dec 15, 2023 Updated 2 years ago
- Removes silence segments from wav audio files ☆30 Feb 29, 2020 Updated 6 years ago
- ☆16 Sep 4, 2019 Updated 6 years ago
- This library provides common speech features for ASR including MFCCs and filterbank energies. ☆2,422 Oct 20, 2021 Updated 4 years ago
- A set of speech feature extraction functions for ASR and speaker identification written in matlab. ☆43 Oct 28, 2016 Updated 9 years ago
- Code for ACM MM2020 paper: Jointly Cross- and Self-Modal Graph Attention Network for Query-Based Moment Localization ☆34 Sep 3, 2020 Updated 5 years ago
- Code for Vision-Infused Deep Audio Inpainting (ICCV 2019) ☆58 Oct 25, 2019 Updated 6 years ago
- Supervised Speech Representation Learning for Parkinson's Disease Classification ☆17 Oct 26, 2021 Updated 4 years ago
- ☆11 Sep 26, 2022 Updated 3 years ago
- System that ranks 2nd in DCASE 2022 Challenge Task 5: Few-shot Bioacoustic Event Detection ☆28 Jul 6, 2022 Updated 3 years ago
- Speaker identification using voice MFCCs and GMM ☆55 Dec 13, 2020 Updated 5 years ago
- A Python 2.7 implementation of Mel Frequency Cepstral Coefficients (MFCC) and Dynamic Time Warping (DTW) algorithms for Automated Speech … ☆17 Apr 23, 2018 Updated 8 years ago
- Gammatone feature for robust speech recognition ☆14 Aug 1, 2016 Updated 9 years ago
- Whisper to Normal Speech Conversion with SC-MelGAN and SC-VQ-VAE ☆15 Dec 3, 2022 Updated 3 years ago
- Feature extraction for accented-speech or pathological speech ☆18 Apr 2, 2019 Updated 7 years ago
- Mispronunciation detection code for jingju singing voice ☆19 Sep 5, 2018 Updated 7 years ago
- Speech recognition of the digits 0-9 based on GMM and MFCC features. GMM, MFCC, speech recognition, Chinese data, sklearn, Digital Voice Recognition. ☆18 Jun 21, 2022 Updated 3 years ago
- Optimizing speaker verification and spoofing countermeasure systems together with REINFORCE ☆13 Mar 31, 2021 Updated 5 years ago
- Counts frequencies of words using movie and television subtitles. ☆20 Jan 26, 2015 Updated 11 years ago
- Speech recognition using Linear Predictive Cepstral Coefficients and Dynamic Time Warping algorithm. ☆15 Feb 19, 2014 Updated 12 years ago
- ☆13 Sep 23, 2025 Updated 8 months ago
- A bunch of experiments using Bark and Mel scales, wavelets and paraconsistent feature engineering in order to find the best methods to cl… ☆13 Aug 16, 2023 Updated 2 years ago
- Fine-tuning Wav2Vec2.0 on Common Voice (zh-HK) ☆16 May 8, 2022 Updated 4 years ago
- A No-Recurrence Sequence-to-Sequence Model for Speech Recognition ☆378 Jul 21, 2022 Updated 3 years ago
- A comparative analysis of speech signal processing algorithms for Parkinson’s disease classification and the use of the tunable Q-factor … ☆15 Dec 8, 2022 Updated 3 years ago