Python scripts to create noisy and reverberant 2-speaker mixture audio with Libri-Light and WHAM
☆17Nov 7, 2024Updated last year
Alternatives and similar repositories for LibriLightMix-WHAMR
Users that are interested in LibriLightMix-WHAMR are comparing it to the libraries listed below
Sorting:
- The source code of Tim-TSENet☆15Apr 22, 2022Updated 3 years ago
- Data simulation scripts for paper "Target Sound Extraction with Variable Cross-modality Clues"☆16May 19, 2023Updated 2 years ago
- Typing to Listen at the Cocktail Party: Text-Guided Target Speaker Extraction (LLM-TSE)☆42Oct 13, 2023Updated 2 years ago
- This repository contains prompts & best practices to annotate audio clips with a very high degree of details using Audio-Language-Models☆35Oct 13, 2024Updated last year
- Official Implementation of TSELM: Target speaker extraction using discrete tokens and language models☆56Apr 14, 2025Updated 10 months ago
- Official implementation of Efficient Speech Separation Framework Based on Neural State-Space Models☆23Feb 25, 2026Updated last week
- ☆36Jan 6, 2026Updated last month
- SLMTokBench for paper "SpeechTokenizer: Unified Speech Tokenizer for Speech Large Language Models"☆37Aug 29, 2023Updated 2 years ago
- This repository presents an evaluation framework for speech-to-speech (S2S) models, following the methodology described in the EmphAsses …☆24Jan 9, 2024Updated 2 years ago
- GPT-style network for phonemization with durations of text☆68Mar 21, 2024Updated last year
- MTalk-Bench: Evaluating Speech-to-Speech Models in Multi-Turn Dialogues via Arena-style and Rubrics Protocols☆16Nov 19, 2025Updated 3 months ago
- The official code for the SALMon🍣 benchmark (ICASSP 2025 - Oral)☆49Aug 15, 2025Updated 6 months ago
- PyTorch implementation of WASE described in our ICASSP 2021: "Wase: Learning When to Attend for Speaker Extraction in Cocktail Party Envi…☆26Jan 11, 2022Updated 4 years ago
- SSR-Speech: Towards Stable, Safe and Robust Zero-shot Speech Editing and Synthesis☆147Jan 1, 2025Updated last year
- Sound Separation, Omni modal☆28Sep 15, 2025Updated 5 months ago
- The open source code for SimpleSpeech series☆145Oct 8, 2024Updated last year
- ☆52Sep 10, 2024Updated last year
- An ASR toolkit with the freedom of topology☆10Dec 18, 2023Updated 2 years ago
- Dataset simulation for DPCCN.☆16Dec 25, 2022Updated 3 years ago
- LLaSE: Maximizing Acoustic Preservation for LLaMA based Speech Enhancement☆16Jul 11, 2025Updated 7 months ago
- Official Codebase of "A Unified Audio-Visual Learning Framework for Localization, Separation, and Recognition" (ICML 2023)☆12Jun 1, 2023Updated 2 years ago
- Official implementation of Self-Remixing☆17Feb 3, 2024Updated 2 years ago
- Official Code for SyllableLM: Learning Coarse Semantic Units for Speech Language Models☆61Jul 1, 2025Updated 8 months ago
- ☆80Aug 11, 2025Updated 6 months ago
- Audio Research in US. US-based professors who work on audio (music, speech, acoustics). For students who would like to apply for RA, PhD,…☆27Updated this week
- ERB representation of an audio file implemented in Python☆27Oct 21, 2018Updated 7 years ago
- This is a list of speech tasks and datasets, which can provide training data for Generative AI, AIGC, AI model training, intelligent spee…☆81Jun 7, 2024Updated last year
- [EMNLP 2024] ESC: Efficient Speech Coding with Cross-Scale Residual Vector Quantized Transformers☆125Mar 20, 2025Updated 11 months ago
- Libriheavy: a 50,000 hours ASR corpus with punctuation casing and context☆214Sep 10, 2024Updated last year
- Official repository for NAST: Noise Aware Speech Tokenization for Speech Language Models (Interspeech 2024) https://arxiv.org/abs/2406.11…☆46Jul 2, 2024Updated last year
- Ultra-low-bitrate Speech Codec for Speech Language Modeling Applications☆87Dec 20, 2024Updated last year
- ☆11Oct 14, 2023Updated 2 years ago
- Pytorch implementation of the paper : A Global-local Attention Framework for Weakly Labelled Audio Tagging.☆13Feb 6, 2021Updated 5 years ago
- Interface Design for Self-Supervised Speech Models, Accepted to Interspeech2024☆16Nov 19, 2024Updated last year
- A simple implementation for improving CosyVoice2 by GRPO method☆32Oct 17, 2025Updated 4 months ago
- ☆16Feb 19, 2026Updated last week
- ☆13Apr 18, 2019Updated 6 years ago
- [NeurIPS 2024] SD-Eval: A Benchmark Dataset for Spoken Dialogue Understanding Beyond Words☆56Jun 25, 2024Updated last year
- The open source code of ALMTokenizer2: Towards Low bit-rate and Semantic-rich Audio Tokenizer with Flow-based Scalar Diffusion Transforme…☆45Sep 5, 2025Updated 5 months ago