speechmatics / speechmatics-python
Python library and CLI for Speechmatics
☆76 Updated last month
Alternatives and similar repositories for speechmatics-python
Users who are interested in speechmatics-python are comparing it to the libraries listed below.
- ☆359 Updated last year
- Various speech datasets made available to the public ☆131 Updated 9 months ago
- Tunable pipelines ☆36 Updated 2 weeks ago
- A merged version of multiple open-source German speech datasets. ☆33 Updated last year
- A python package for whisper normalizer ☆65 Updated this week
- A tokenizer, text cleaner, and phonemizer for many human languages. ☆325 Updated 10 months ago
- Zero-shot Audio Classification using Whisper ☆79 Updated 2 years ago
- Finetune VITS and MMS using HuggingFace's tools ☆164 Updated last year
- ☆47 Updated 2 years ago
- Simplified diarization pipeline using some pretrained models - audio file to diarized segments in a few lines of code ☆148 Updated last year
- Advanced data structures for handling temporal segments with attached labels. ☆118 Updated last week
- Reproducible experimental protocols for multimedia (audio, video, text) database ☆107 Updated this week
- Gecko - A Tool for Effective Annotation of Human Conversations ☆298 Updated 2 years ago
- ☆308 Updated last year
- A python package for deep multilingual punctuation prediction. ☆132 Updated last year
- ☆23 Updated last year
- On-device voice activity detection (VAD) powered by deep learning ☆230 Updated last month
- This will hold the data pipeline to convert raw audio data to speech which will act as input dataset for speech-to-text pipeline ☆32 Updated 2 years ago
- Linguistic processing for Common Voice ☆57 Updated last year
- ☆128 Updated last week
- Small repo describing how to use Hugging Face's Wav2Vec2 with PyCTCDecode ☆111 Updated 3 years ago
- Speaker Diarization with Transformers ☆69 Updated 3 months ago
- Prompting Whisper for Audio-Visual Speech Recognition, Code-Switched Speech Recognition, and Zero-Shot Speech Translation ☆150 Updated last year
- The Gridspace-Stanford Harper Valley speech dataset. Created in support of CS224S. ☆48 Updated 4 years ago
- Repository containing experimentation platform on how to train, infer on wav2vec2 models. ☆87 Updated 3 years ago
- An efficient OpenFST-based tool for calculating WER and aligning two transcript sequences. ☆169 Updated 4 months ago
- Universal Romanizer that can convert any unicode script to roman (latin) script ☆225 Updated last year
- Grapheme-to-Phoneme transductions that preserve input and output indices, and support cross-lingual g2p! ☆174 Updated this week
- A model that predicts the punctuation of English, Italian, French and German texts. ☆79 Updated 2 years ago
- ☆38 Updated 3 years ago