pengzhendong / streaming-asr
One command to start a streaming ASR server.
☆11 Updated 5 months ago
Alternatives and similar repositories for streaming-asr:
Users who are interested in streaming-asr are comparing it to the libraries listed below.
- CTC decoder with hotwords for ASR. ☆17 Updated 2 months ago
- ☆10 Updated 2 years ago
- ☆10 Updated 5 months ago
- SChunk-Encoder (Transformer or Conformer) for streaming E2E ASR ☆9 Updated 2 years ago
- faster inference ☆27 Updated 2 months ago
- Incremental Disentanglement for Environment-Aware Zero-Shot Text-to-Speech Synthesis ☆21 Updated this week
- Once more Diarization: Improving meeting transcription systems through segment-level speaker reassignment ☆12 Updated last month
- Silero VAD PyTorch implementation ☆16 Updated 4 months ago
- ☆10 Updated 4 months ago
- ☆13 Updated last year
- Megatts2 using HierSpeechpp's vocoder ☆18 Updated 3 months ago
- ☆11 Updated last month
- Speaker-aware CTC (SACTC) for multi-talker overlapped speech recognition. ☆12 Updated last week
- Glow-TTS with Stochastic Duration Predictor and Stochastic Pitch Predictor ☆18 Updated last year
- (WIP) Long-form speech generation ☆30 Updated 3 months ago
- An evaluation set for large-scale trained TTS models (Coming in Sep 2024) ☆12 Updated 6 months ago
- Tidy Tunes is an easy-to-use pipeline for mining high-quality audio data for speech generation models. To do so, it chains multiple open … ☆15 Updated 2 weeks ago
- Official implementation of the APSIPA 2022 paper: Exploring Speaker Age Estimation on Different Self-Supervised Learning Models ☆14 Updated 2 years ago
- Official implementation of paper: Frame-Wise Breath Detection with Self-Training: An Exploration of Enhancing Breath Naturalness in Text-… ☆25 Updated 6 months ago
- Models and code for the INTERSPEECH 2023 paper DistilXLSR: A Light Weight Cross-Lingual Speech Representation Model ☆10 Updated last year
- [ASRU 2023] Code for the paper SALT: Distinguishable Speaker Anonymization Through Latent Space Transformation ☆18 Updated 7 months ago
- A simple command line tool to calculate WER for ASR. ☆14 Updated 5 months ago
- Vocoder-Free Non-Parallel Conversion of Whispered Speech With Masked Cycle-Consistent Generative Adversarial Networks ☆17 Updated last year
- FreeCodec: A Disentangled Neural Speech Codec with Fewer Tokens ☆19 Updated 6 months ago
- ☆13 Updated 4 months ago
- ☆11 Updated 3 years ago
- Source code of EfficientTTS 2 ☆12 Updated last year
- An unofficial PyTorch implementation of Mix-Phoneme-Bert ☆39 Updated last year
- ☆26 Updated last year
- Just another FastSpeech 2 but cleaner code :) ☆26 Updated 8 months ago