benluks / streaming-asr
Low-latency ASR using SpeechBrain StreamingASR and torchaudio StreamReader.
☆16 Updated this week
Alternatives and similar repositories for streaming-asr:
Users who are interested in streaming-asr are comparing it to the libraries listed below.
- ☆84 Updated this week
- ☆59 Updated last year
- An espeak-compatible, permissively-licensed IPA phonemizer (G2P) based on DeepPhonemizer. Usable as a drop-in replacement for espeak's GP… ☆95 Updated 5 months ago
- A TTS model that makes a speaker speak new languages ☆76 Updated 9 months ago
- Trying to build an all in one speech-text language model - a bit like GPT-4o ☆22 Updated 10 months ago
- ☆84 Updated last year
- ☆20 Updated 2 years ago
- small audio language model for reasoning ☆50 Updated last week
- ☆103 Updated last month
- ☆104 Updated this week
- Audio tokenization, in the fastest way possible! ☆49 Updated 7 months ago
- Transcribing Speech with Multinomial Diffusion, training code and models. ☆76 Updated last year
- A Massive Multilingual Multi-speaker Speech Corpus for Scaling Indian TTS ☆36 Updated 3 months ago
- [Batching/MultiGPU/DataLoader Implemented] Code for the paper Hybrid Spectrogram and Waveform Source Separation ☆22 Updated last year
- ☆41 Updated this week
- Code for the paper: GAMA: A Large Audio-Language Model with Advanced Audio Understanding and Complex Reasoning Abilities ☆119 Updated 3 months ago
- Implementation of Google's USM speech model in Pytorch ☆30 Updated 2 months ago
- Official Code for SyllableLM: Learning Coarse Semantic Units for Speech Language Models ☆51 Updated last month
- ☆44 Updated 7 months ago
- Implementation of "Audio xLSTMs: Learning Self-supervised audio representations with xLSTMs" in PyTorch ☆18 Updated 2 weeks ago
- This app is intended to automatically create a corpus for ASR systems using pseudo-labeling. ☆27 Updated last year
- Unofficial implementation of wavenext vocoder ☆44 Updated 7 months ago
- Final training script from HuggingFace Whisper Fine tuning event - to get best results on finetuned model. ☆12 Updated 2 years ago
- Collection of scripts from mHuBERT-147. ☆24 Updated 4 months ago
- The demo page of UniAudio ☆33 Updated last year
- Official Code for ParrotTTS ☆48 Updated 5 months ago
- [ICASSP 2025] Official Pytorch implementation of "Large Language Models are Strong Audio-Visual Speech Recognition Learners". ☆14 Updated 3 weeks ago
- Audiogen Codec ☆131 Updated 8 months ago
- Contains the code associated with the ICLR submission for our text-to-speech diffusion model ☆53 Updated last year
- Repository for fine-tuning Transformers 🤗 based seq2seq speech models in JAX/Flax. ☆35 Updated 2 years ago