BirgerMoell / tmh
☆17 Updated 2 years ago
Alternatives and similar repositories for tmh:
Users who are interested in tmh are comparing it to the libraries listed below:
- Incorporating KenLM language model with HuggingFace implementation of Wav2Vec2CTC Model using beam search decoding ☆75 Updated 3 years ago
- Explore different way to mix speech model(wav2vec2, hubert) and nlp model(BART,T5,GPT) together ☆47 Updated 2 years ago
- ☆65 Updated 7 months ago
- 56 language, 1 model Multilingual ASR ☆25 Updated 3 years ago
- ☆38 Updated 3 years ago
- Speech-MASSIVE is a multilingual Spoken Language Understanding (SLU) dataset comprising the speech counterpart for a portion of the MASSI… ☆21 Updated 7 months ago
- A mini, simple, and fast end-to-end automatic speech recognition toolkit. ☆50 Updated 2 years ago
- Official code for Wav2Seq ☆96 Updated 2 years ago
- multilingual speech aligner ☆74 Updated last year
- ☆34 Updated 3 years ago
- pytorch implementation for MultiSpeech: Multi-Speaker Text to Speech with Transformer paper ☆21 Updated 2 years ago
- demo page https://MingjieChen.github.io/dygan-vc ☆67 Updated 3 years ago
- ☆16 Updated last year
- Speaker change detection using SincNet and an LSTM/Transformer ☆50 Updated 9 months ago
- PyTorch Implementation of Daft-Exprt: Robust Prosody Transfer Across Speakers for Expressive Speech Synthesis ☆56 Updated 3 years ago
- MSP-Podcast Challenge Baseline Code ☆21 Updated 10 months ago
- ☆49 Updated 3 years ago
- End-to-end MOdeling of ASR (Automatic Speech Recognition) ☆33 Updated 2 years ago
- The VoxTube dataset official repository ☆68 Updated last year
- ☆75 Updated 3 years ago
- Code for the Paper Speech Recognition and Multi-Speaker Diarization of Long Conversations ☆38 Updated last year
- SHAS: Approaching optimal Segmentation for End-to-End Speech Translation ☆38 Updated 2 years ago
- ☆12 Updated 2 months ago
- ☆79 Updated 11 months ago
- A speaker embedding network in Pytorch that is very quick to set up and use for whatever purposes. ☆88 Updated 3 weeks ago
- This app is intended to automatically create a corpus for ASR systems using pseudo-labeling. ☆27 Updated last year
- ☆21 Updated 8 months ago
- A toolkit to calculate speech audio quality. Not affiliated with the original authors ☆50 Updated 8 months ago
- Official implementation of the paper "Laughter Synthesis using Pseudo Phonetic Tokens with a Large-scale In-the-wild Laughter Corpus" acc… ☆75 Updated last year
- [INTERSPEECH'2022] Accurate Emotion Strength Assessment for Seen and Unseen Speech Based on Data-Driven Deep Learning ☆81 Updated 2 years ago