mtkresearch / clairaudience
Zero-shot Domain-sensitive Speech Recognition with Prompt-conditioning Fine-tuning (ASRU2023)
☆27 Updated last year
Alternatives and similar repositories for clairaudience:
Users interested in clairaudience are comparing it to the libraries listed below.
- Clustering-based methods for overlapping diarization ☆80 Updated last year
- ConMamba for Automatic Speech Recognition ☆63 Updated 7 months ago
- This repository surveys papers focusing on prompting and adapters for speech processing. ☆108 Updated last year
- A curated list of awesome papers on contextualizing E2E ASR outputs ☆77 Updated last year
- ☆75 Updated last year
- Prompting Whisper for Audio-Visual Speech Recognition, Code-Switched Speech Recognition, and Zero-Shot Speech Translation ☆143 Updated last year
- A list of papers for child ASR ☆38 Updated 5 months ago
- Official repository for the "Powerset multi-class cross entropy loss for neural speaker diarization" paper published in Interspeech 2023. ☆81 Updated last year
- Official implementation for Fast-HuBERT: An Efficient Training Framework for Self-Supervised Speech Representation Learning ☆86 Updated 4 months ago
- Libriheavy: a 50,000 hours ASR corpus with punctuation casing and context ☆193 Updated 6 months ago
- A mini, simple, and fast end-to-end automatic speech recognition toolkit. ☆50 Updated 2 years ago
- Speaker change detection using SincNet and an LSTM/Transformer ☆48 Updated 9 months ago
- The official PyTorch implementation of "Frame-wise streaming end-to-end speaker diarization with non-autoregressive self-attention-based … ☆126 Updated last month
- Code for our INTERSPEECH paper Simul-Whisper: Attention-Guided Streaming Whisper with Truncation Detection ☆62 Updated last week
- An automatic prosodic boundary annotation tool for Text-to-Speech Synthesis (TTS). ☆48 Updated 9 months ago
- Code and model for ICASSP 2025 Paper "Developing Instruction-Following Speech Language Model Without Speech Instruction-Tuning Data" ☆72 Updated last month
- Reference-aware automatic speech evaluation toolkit ☆145 Updated 4 months ago
- This repository contains the training, inference, evaluation code for SpeechLLM models and details about the model releases on huggingfac… ☆94 Updated 9 months ago
- Official implementation of the paper "Speech Intelligibility Assessment of Dysarthric Speech by using Goodness of Pronunciation with Unce… ☆21 Updated 3 weeks ago
- Layer-wise analysis of self-supervised pre-trained speech representations ☆102 Updated 5 months ago
- Implementation of BEST-RQ - a model for self-supervised learning of speech signals using a random projection quantizer, in PyTorch. ☆112 Updated last year
- ☆64 Updated 6 months ago
- INTERSPEECH 23 - Refunction Whisper to recognize new tasks with adapters! ☆37 Updated last year
- End-to-end MOdeling of ASR (Automatic Speech Recognition) ☆33 Updated 2 years ago
- An evolving, large-scale and multi-domain ASR corpus for low-resource languages with automated crawling, transcription and refinement ☆151 Updated 3 weeks ago
- ☆43 Updated 2 years ago
- **Interspeech 2022** 《SpeechPrompt: An Exploration of Prompt Tuning on Generative Spoken Language Model for Speech Processing Tasks》Speec… ☆99 Updated last year
- Companion repo for the paper "PixIT: Joint Training of Speaker Diarization and Speech Separation from Real-world Multi-speaker Recordings… ☆82 Updated 2 months ago
- asr2k ☆49 Updated 10 months ago
- Official repository of NeXt-TDNN for speaker verification ☆70 Updated 5 months ago