☆30Jun 12, 2025Updated 9 months ago
Alternatives and similar repositories for TS-Whisper
Users that are interested in TS-Whisper are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.
Sorting:
- Cross-Speaker Encoding Network for Multi-talker Speech Recognition☆12Mar 14, 2025Updated last year
- ADAPTING SELF-SUPERVISED MODELS TO MULTI-TALKER SPEECH RECOGNITION USING SPEAKER EMBEDDINGS☆33Mar 16, 2023Updated 3 years ago
- Speaker-aware CTC (SACTC) for multi-talker overlapped speech recognition.☆22May 26, 2025Updated 10 months ago
- The implementation for "Empowering Whisper as a Joint Multi-Talker and Target-Talker Speech Recognition System".☆30Aug 2, 2025Updated 7 months ago
- Learning Domain-Invariant Transformation for Speaker Verification.☆11Jun 13, 2023Updated 2 years ago
- Managed Database hosting by DigitalOcean • AdPostgreSQL, MySQL, MongoDB, Kafka, Valkey, and OpenSearch available. Automatically scale up storage and focus on building your apps.
- ☆10Oct 16, 2025Updated 5 months ago
- Implementation of CGMM-MVDR beamforming used for Clarity challenge☆14Jan 14, 2022Updated 4 years ago
- The implementation for "Large Language Model Can Transcribe Speech in Multi-Talker Scenarios with Versatile Instructions"☆50Apr 7, 2025Updated 11 months ago
- A implement of adaptive score normalization (AS-Norm) in speaker verification/recognition with pytorch☆10Oct 12, 2022Updated 3 years ago
- This repository contains prompts & best practices to annotate audio clips with a very high degree of details using Audio-Language-Models☆35Oct 13, 2024Updated last year
- ☆44Apr 2, 2025Updated 11 months ago
- Code and data recipes for the paper: Optimal Condition Training for Target Source Separation by Efthymios Tzinis, Gordon Wichern, Paris S…☆14Feb 15, 2023Updated 3 years ago
- ☆37Jun 30, 2022Updated 3 years ago
- This Repository surveys the paper focusing on Prompting and Adapters for Speech Processing.☆111Aug 4, 2023Updated 2 years ago
- 1-Click AI Models by DigitalOcean Gradient • AdDeploy popular AI models on DigitalOcean Gradient GPU virtual machines with just a single click and start building anything your business needs.
- The project for speech translation☆12Sep 28, 2023Updated 2 years ago
- This repository follows papers and reports on discrete speech representation learning and speech tokenization methods for speech language…☆15Dec 1, 2023Updated 2 years ago
- This is the code and dataset repo for Interspeech 2024 paper "Target conversation extraction: Source separation using turn-taking dynamic…☆56Aug 15, 2025Updated 7 months ago
- Pytorch implementation of Diff-SV: A Unified Hierarchical Framework for Noise-Robust Speaker Verification Using Score-Based Diffusion Pro…☆23Dec 14, 2023Updated 2 years ago
- ☆20Sep 2, 2024Updated last year
- [ICASSP 2024] KNN-CTC: Enhancing ASR via Retrieval of CTC Pseudo Labels☆42Mar 20, 2024Updated 2 years ago
- Official Implementation of TSELM: Target speaker extraction using discrete tokens and language models☆57Apr 14, 2025Updated 11 months ago
- Feedforward Sequential Memory Networks☆16Aug 2, 2022Updated 3 years ago
- ☆18Nov 22, 2024Updated last year
- End-to-end encrypted email - Proton Mail • AdSpecial offer: 40% Off Yearly / 80% Off First Month. All Proton services are open source and independently audited for security.
- ☆12Aug 25, 2023Updated 2 years ago
- Implementation of Sheffield entry for Clarity enhancement challenge.☆18Apr 19, 2022Updated 3 years ago
- ☆11Oct 24, 2022Updated 3 years ago
- ☆17Jul 22, 2024Updated last year
- Promting Whisper for Audio-Visual Speech Recognition, Code-Switched Speech Recognition, and Zero-Shot Speech Translation☆150Jan 16, 2024Updated 2 years ago
- SLT 2024 Challenge: Post-ASR-Speaker-Tagging☆16Jun 16, 2024Updated last year
- MultiSV: scripts for data preparation☆30Jan 18, 2025Updated last year
- Official implementation of A cappella: Audio-visual Singing VoiceSeparation, from BMVC21☆17May 14, 2022Updated 3 years ago
- Official repository of the IEEE SLT 2024 paper "Self-Supervised Syllable Discovery Based on Speaker-Disentangled HuBERT"☆45Mar 17, 2026Updated last week
- End-to-end encrypted email - Proton Mail • AdSpecial offer: 40% Off Yearly / 80% Off First Month. All Proton services are open source and independently audited for security.
- Code of the paper "Low-Latency Speech Separation Guided Diarization for Telephone Conversations"☆15Dec 22, 2022Updated 3 years ago
- Generative Expressive Conversational Speech Synthesis (Accepted by MM'2024)☆62Nov 1, 2024Updated last year
- Accompanying code for paper "Attention-Based Contextual Language Model Adaptation for Speech Recognition", submitted to ACL 2021.☆14Jul 25, 2023Updated 2 years ago
- Official code for Interspeech 2023 paper "Self-supervised Fine-tuning for Improved Content Representations by Speaker-invariant Clusterin…☆64May 19, 2023Updated 2 years ago
- The implementation of "X-TF-GridNet: A Time-Frequency Domain Target Speaker Extraction Network with Adaptive Speaker Embedding Fusion", w…☆98Sep 2, 2025Updated 6 months ago
- Implementation of CoBERT: Self-Supervised Speech Representation Learning Through Code Representation Learning☆48Nov 8, 2023Updated 2 years ago
- Deep learning project template based on PyTorch Lightning and Hydra.☆13May 26, 2022Updated 3 years ago