ErikEkstedt / VoiceActivityProjection
Voice Activity Projection Models: Self-supervised learning of Turn-taking Events
☆36 · Updated 3 months ago
Related projects:
- Reference-aware automatic speech evaluation toolkit ☆95 · Updated 6 months ago
- Clustering-based methods for overlapping diarization ☆68 · Updated 8 months ago
- ☆69 · Updated this week
- Libriheavy: a 50,000 hours ASR corpus with punctuation casing and context ☆171 · Updated last week
- ☆37 · Updated 3 years ago
- multilingual speech aligner ☆70 · Updated 10 months ago
- Phoneme segmentation using pre-trained speech models ☆49 · Updated last year
- End-to-end MOdeling of ASR (Automatic Speech Recognition) ☆33 · Updated last year
- Unofficial implementation of miipher ☆104 · Updated 5 months ago
- Deep Articulatory Synthesis and Inversion ☆41 · Updated 7 months ago
- Official repository for the "Powerset multi-class cross entropy loss for neural speaker diarization" paper published in Interspeech 2023. ☆64 · Updated 11 months ago
- UTokyo-SaruLab MOS Prediction System ☆49 · Updated this week
- Official code for Wav2Seq ☆95 · Updated 2 years ago
- ☆48 · Updated 11 months ago
- Official implementation of the paper "Laughter Synthesis using Pseudo Phonetic Tokens with a Large-scale In-the-wild Laughter Corpus" acc… ☆70 · Updated last year
- ☆62 · Updated 4 months ago
- A sequence-to-sequence voice conversion toolkit. ☆84 · Updated 2 months ago
- The VoxTube dataset official repository ☆60 · Updated 7 months ago
- **Interspeech 2022** 《SpeechPrompt: An Exploration of Prompt Tuning on Generative Spoken Language Model for Speech Processing Tasks》Speec… ☆97 · Updated last year
- UT-Sarulab MOS prediction system using SSL models ☆163 · Updated 5 months ago
- ☆49 · Updated last week
- Transcribing Speech with Multinomial Diffusion, training code and models. ☆74 · Updated 11 months ago
- Unified Speech Language Model for paper "SpeechTokenizer: Unified Speech Tokenizer for Speech Large Language Models" (ICLR 2024) ☆127 · Updated last year
- Layer-wise analysis of self-supervised pre-trained speech representations ☆88 · Updated last month
- This repository surveys papers focusing on prompting and adapters for speech processing. ☆103 · Updated last year
- ☆41 · Updated 7 months ago
- A python library for voice activity detection (VAD) for speech/non-speech segmentation. ☆80 · Updated 2 years ago
- Implementation of SoundStorm built upon SpeechTokenizer. ☆98 · Updated 10 months ago
- INTERSPEECH 23 - Refunction Whisper to recognize new tasks with adapters! ☆31 · Updated last year
- SelfRemaster: SSL Speech Restoration ☆81 · Updated 8 months ago