gkrsv / split_audio
A rough-and-ready Python utility that splits audio files based on silence and a desired min/max chunk duration.
☆16 Updated 3 years ago
Alternatives and similar repositories for split_audio
Users who are interested in split_audio are comparing it to the libraries listed below.
- Tacotron 2 - PyTorch implementation with faster-than-realtime inference, modified to enable cross-lingual voice cloning. ☆360 Updated 2 years ago
- Live speech recognition using Facebook's wav2vec 2.0 model. ☆375 Updated last year
- Voice-based gender recognition using Mel-frequency cepstrum coefficients (MFCC) and Gaussian mixture models (GMM) ☆218 Updated 2 years ago
- A summary of our attempts at using Deep Learning approaches for Emotional Text to Speech ☆459 Updated last year
- My-Voice Analysis is a Python library for the analysis of voice (simultaneous speech, high entropy) without the need of a transcription. … ☆330 Updated 4 years ago
- A Python library for measuring the acoustic features of speech (simultaneous speech, high entropy) compared to those of native speech. ☆269 Updated 3 years ago
- Simplified diarization pipeline using some pretrained models - audio file to diarized segments in a few lines of code ☆153 Updated last year
- A curated list of awesome Speaker Diarization papers, libraries, datasets, and other resources. ☆17 Updated 6 years ago
- PyTorch Implementation of Non-autoregressive Expressive (emotional, conversational) TTS based on FastSpeech2, supporting English, Korean,… ☆315 Updated 4 years ago
- A tokenizer, text cleaner, and phonemizer for many human languages. ☆331 Updated last year
- A model that predicts the punctuation of English, Italian, French and German texts. ☆83 Updated 2 years ago
- PyTorch Implementation of FastSpeech 2: Fast and High-Quality End-to-End Text to Speech ☆232 Updated 3 years ago
- [WIP] VoiceSmith makes training text to speech models easy. ☆228 Updated 3 years ago
- Segment an audio file and obtain utterance alignments. (Python package) ☆345 Updated last year
- Phoneme tokenizer and grapheme-to-phoneme model for 8k languages ☆174 Updated 2 years ago
- Speech Toolkit for Malaysian language, https://malaya-speech.readthedocs.io/ ☆275 Updated 3 months ago
- Grapheme to phoneme conversion with deep learning. ☆416 Updated 2 years ago
- HuggingSound: A toolkit for speech-related tasks based on Hugging Face's tools ☆467 Updated 2 years ago
- Speech noise reduction which was generated using existing post-production techniques implemented in Python ☆181 Updated 4 years ago
- A Python package for deep multilingual punctuation prediction. ☆153 Updated last year
- VoiceSplit: Targeted Voice Separation by Speaker-Conditioned Spectrogram ☆265 Updated last year
- ☆205 Updated 3 years ago
- ☆45 Updated 5 months ago
- [WIP] Scripts for fine-tuning Whisper ☆222 Updated 2 years ago
- Multispeaker & Emotional TTS based on Tacotron 2 and Waveglow ☆129 Updated 4 years ago
- An easy way to fine-tune Wav2Vec 2.0 for low-resource languages. ☆80 Updated 2 years ago
- ☆262 Updated 3 years ago
- A Python library to generate speech datasets from YouTube videos ☆36 Updated last year
- Finetune VITS and MMS using HuggingFace's tools ☆187 Updated last year
- VocGAN: A High-Fidelity Real-time Vocoder with a Hierarchically-nested Adversarial Network ☆321 Updated last year