ngbala6 / Audio-Processing
This repo covers audio processing techniques and silence removal using Python
☆17 · Updated 4 years ago
Alternatives and similar repositories for Audio-Processing:
Users who are interested in Audio-Processing are comparing it to the repositories listed below
- TTS Client for Coqui TTS server ☆13 · Updated 2 years ago
- Model for recasing and repunctuating ASR transcripts ☆133 · Updated 11 months ago
- 🐸STT integration examples ☆126 · Updated 2 years ago
- A free & open tool for transcribing audio interviews with offline ASR support ☆24 · Updated last year
- Identifying individual speakers in an audio stream based on the unique characteristics found in individual voices using Python ☆18 · Updated last year
- Check your data, before you wreck your model ☆16 · Updated 2 years ago
- Coqui STT Model Manager - install, manage and try out Coqui STT models from the Model Zoo ☆25 · Updated 2 years ago
- How to create your own model for vosk ☆70 · Updated 3 years ago
- ☆38 · Updated 3 years ago
- Detect a person's gender from a voice file (90.7% +/- 1.3% accuracy). ☆81 · Updated 9 months ago
- Linguistic processing for Common Voice ☆55 · Updated last year
- End-to-end spoken language identification out of the box. ☆48 · Updated 4 years ago
- Go from raw audio files to a text-audio dataset automatically with OpenAI's Whisper. ☆135 · Updated last year
- Keras (TensorFlow) implementations of Automatic Speech Recognition ☆23 · Updated 3 years ago
- Zero-shot multimodal punctuation insertion and truecasing using Whisper ☆109 · Updated 2 years ago
- ☆39 · Updated last year
- Python library for handling audio datasets. ☆137 · Updated last year
- An automatic speech recognition API ☆54 · Updated this week
- DeepSpeech based forced alignment tool ☆237 · Updated 4 years ago
- Simplified diarization pipeline using some pretrained models - audio file to diarized segments in a few lines of code ☆146 · Updated 10 months ago
- Python module to clean and transliterate (i.e. normalize) German text including abbreviations, numbers, timestamps etc. It can be used to… ☆32 · Updated 4 years ago
- Timething is a library for aligning text transcripts with their audio recordings. ☆115 · Updated 3 months ago
- Speaker diarization python system based on binary key speaker modelling ☆61 · Updated 3 years ago
- Running Mozilla's implementation of Baidu DeepSpeech on Google Colaboratory ☆16 · Updated 6 years ago
- Tools to create your own voice dataset for TTS training ☆66 · Updated 4 years ago
- On-device speaker recognition engine powered by deep learning ☆33 · Updated this week
- Reproducible experimental protocols for multimedia (audio, video, text) database ☆98 · Updated last month
- A bidirectional recurrent neural network model with attention mechanism for restoring missing punctuation in unsegmented text ☆36 · Updated 4 years ago
- Some simple wrappers around eSpeak NG intended to make using this excellent TTS for waveform and IPA generation as convenient as possible… ☆41 · Updated 5 months ago
- An even smaller speech recognizer / force aligner ☆32 · Updated 3 months ago