benluks / streaming-asr
Low-latency ASR using SpeechBrain StreamingASR and torchaudio StreamReader.
☆17 Updated 2 weeks ago
Alternatives and similar repositories for streaming-asr:
Users who are interested in streaming-asr are comparing it to the repositories listed below:
- ☆92 Updated last week
- ☆84 Updated last year
- Implementation of Google's USM speech model in PyTorch ☆31 Updated last month
- The implementation for "Large Language Model Can Transcribe Speech in Multi-Talker Scenarios with Versatile Instructions" ☆39 Updated last month
- Provide Gradio custom components to make the diarization-based audio labeling process easier and faster. ☆62 Updated 3 weeks ago
- This app is intended to automatically create a corpus for ASR systems using pseudo-labeling. ☆27 Updated last year
- Prompting Whisper for Audio-Visual Speech Recognition, Code-Switched Speech Recognition, and Zero-Shot Speech Translation ☆143 Updated last year
- ☆20 Updated 2 years ago
- Official repository for the "Powerset multi-class cross entropy loss for neural speaker diarization" paper published in Interspeech 2023. ☆83 Updated last year
- Speech-MASSIVE is a multilingual Spoken Language Understanding (SLU) dataset comprising the speech counterpart for a portion of the MASSI… ☆21 Updated 8 months ago
- Collection of scripts from mHuBERT-147. ☆24 Updated 5 months ago
- A TTS model that makes a speaker speak new languages ☆76 Updated 10 months ago
- LibriSpeech-Long is a benchmark dataset for long-form speech generation and processing. Released as part of "Long-Form Speech Generation … ☆64 Updated 4 months ago
- Transcribing Speech with Multinomial Diffusion, training code and models. ☆76 Updated last year
- LiteASR: Efficient Automatic Speech Recognition with Low-Rank Approximation ☆101 Updated 3 weeks ago
- ☆46 Updated 2 years ago
- Official Code for SyllableLM: Learning Coarse Semantic Units for Speech Language Models ☆52 Updated 2 weeks ago
- Prosodic Speech Segmentation with Transformers ☆25 Updated last year
- Zero-shot Domain-sensitive Speech Recognition with Prompt-conditioning Fine-tuning (ASRU2023) ☆27 Updated last year
- Unofficial implementation of the WaveNeXt vocoder ☆44 Updated 8 months ago
- ☆31 Updated last month
- Small audio language model for reasoning ☆61 Updated 3 weeks ago
- ☆46 Updated 8 months ago
- A low-bitrate single-codebook 16 kHz speech codec based on focal modulation ☆86 Updated 2 months ago
- A Massive Multilingual Multi-speaker Speech Corpus for Scaling Indian TTS ☆38 Updated 4 months ago
- Codebase for 'Scaling Rich Style-Prompted Text-to-Speech Datasets' ☆119 Updated last month
- Speaker change detection using SincNet and an LSTM/Transformer ☆50 Updated 10 months ago
- ☆59 Updated last year
- ☆37 Updated 2 weeks ago
- Official PyTorch implementation of "Paralinguistics-Aware Speech-Empowered LLMs for Natural Conversation" (NeurIPS 2024) ☆87 Updated 5 months ago