jim-schwoebel / pauses
Quick library to extract pause lengths from audio files.
☆31 · Updated 5 years ago
Related projects:
- End-to-end spoken language identification out of the box. ☆48 · Updated 3 years ago
- [deprecated] Pretrained models for pyannote-audio 1.x ☆70 · Updated 2 years ago
- Gentle and praatio scripts for easy forced alignment ☆18 · Updated last year
- 🐸TTS recipes for different datasets ☆84 · Updated 2 years ago
- A model that predicts the punctuation of English, Italian, French and German texts. ☆70 · Updated last year
- A bidirectional recurrent neural network model with attention mechanism for restoring missing punctuation in unsegmented text ☆35 · Updated 4 years ago
- Speaker diarization python system based on binary key speaker modelling ☆61 · Updated 2 years ago
- A crash course for training speech recognition models using DeepSpeech. ☆23 · Updated 3 years ago
- Zero-shot multimodal punctuation insertion and truecasing using Whisper ☆95 · Updated last year
- Reproducible experimental protocols for multimedia (audio, video, text) database ☆79 · Updated 5 months ago
- Speaker diarization via transfer learning ☆27 · Updated 5 years ago
- Advanced data structures for handling temporal segments with attached labels. ☆95 · Updated 3 months ago
- Running Mozilla's implementation of Baidu DeepSpeech on Google Colaboratory ☆16 · Updated 5 years ago
- Simple text to phonemes converter for multiple languages ☆21 · Updated last year
- Python library for handling audio datasets. ☆131 · Updated last year
- Automatically align transcribed audio and generate a wav2letter training corpus ☆34 · Updated last year
- OpenAI Whisper Prompt Examples ☆39 · Updated last year
- A collection of useful tools for handling speech recognition data ☆30 · Updated last year
- A python package for deep multilingual punctuation prediction. ☆87 · Updated 3 weeks ago
- TTS Client for Coqui TTS server ☆13 · Updated last year
- An algorithm that analyses the acoustic features of a voice and creates an acoustic classifier; useful for auto-speech-rater ☆11 · Updated 5 years ago
- Code for Speaker Change Detection in Broadcast TV using Bidirectional Long Short-Term Memory Networks ☆61 · Updated 4 years ago
- Convert Arpabet to IPA. Arpabet is the set of phonemes used by the CMU Pronouncing Dictionary. IPA is the International Phonetic Alphabet… ☆42 · Updated 4 years ago
- Deep Learning model for lexical stress detection in spoken English ☆25 · Updated 4 years ago
- Code for AccentDB. ☆20 · Updated 3 years ago
- Python library for audio augmentation ☆83 · Updated last year
- Deep Neural Networks for audio classification ☆11 · Updated 5 months ago
- An easy way to fine-tune Wav2Vec 2.0 for low-resource languages. ☆81 · Updated last year
- A data annotation pipeline to generate high-quality, large-scale speech datasets with machine pre-labeling and fully manual auditing. ☆100 · Updated last year
- A module for normalising text. ☆172 · Updated 2 years ago