xiaomi-research / dasheng-denoiser
Official PyTorch inference code for the Interspeech 2025 paper: Efficient Speech Enhancement via Embeddings from Pre-trained Generative Audioencoders
☆45Updated 3 weeks ago
Alternatives and similar repositories for dasheng-denoiser
Users who are interested in dasheng-denoiser are comparing it to the libraries listed below.
- Streaming Audiotransformers for online Audio tagging☆45Updated last year
- Streamable Text-to-Speech model using a language modeling approach, without vector quantization☆92Updated last month
- (WIP) Long-form speech generation☆31Updated 3 months ago
- Apply Score diffusion to improve speech signals recorded under various adverse conditions and distortions, including noise, reverberation…☆63Updated 11 months ago
- A toolkit for researchers in the multimodal sound separation.☆16Updated last year
- The open source code of ALMTokenizer2: Towards Low bit-rate and Semantic-rich Audio Tokenizer with Flow-based Scalar Diffusion Transforme…☆26Updated last month
- Official Implementation of TSELM: Target speaker extraction using discrete tokens and language models☆46Updated 2 months ago
- Generative Expressive Conversational Speech Synthesis (Accepted by MM'2024)☆73Updated 8 months ago
- faster inference☆28Updated 5 months ago
- ☆47Updated 3 months ago
- (Interspeech 2023 & ICASSP 2024) Official repository for ARMHuBERT and STaRHuBERT☆40Updated 10 months ago
- Contains the code associated with the ICLR submission for our text-to-speech diffusion model☆54Updated last year
- Code of the paper "Low-Latency Speech Separation Guided Diarization for Telephone Conversations"☆14Updated 2 years ago
- [ACL 2024] Generative Pre-Trained Speech Language Model with Efficient Hierarchical Transformer☆59Updated 8 months ago
- ☆33Updated last year
- Speech-To-Text forced-alignment Speech processing Universal PERformance Benchmark☆28Updated 2 months ago
- A neural speech codec based on discrete WavLM representations☆24Updated 10 months ago
- Spherical residual vector quantization (SRVQ)☆30Updated 10 months ago
- Generative Expressive Conversational Speech Synthesis (Accepted by MM'2024)☆59Updated 8 months ago
- A single-layer, streaming codec model providing SOTA audio quality and discrete tokens designed for superior downstream modelability.☆83Updated last month
- E2E TTS using Conditional Flow Matching (Experimental*)☆70Updated last year
- Official source code of the INTERSPEECH 2023 paper: "Audio-Visual Speech Separation in Noisy Environments with a Lightweight Iterative Mo…☆19Updated last year
- ☆75Updated this week
- [InterSpeech'2024] FluentEditor:Text-based Speech Editing by Considering Acoustic and Prosody Consistency☆55Updated 8 months ago
- semantic tokenizer for speech and music☆21Updated last week
- Inference code for Audiodec-Valle-Wenetspeech4TTS☆50Updated 11 months ago
- ☆63Updated last year
- ☆28Updated last week
- This is the official train-dev-test release of the Interspeech2024 Discrete Speech Representation Challenge.☆32Updated last year
- Ultra-low-bitrate Speech Codec for Speech Language Modeling Applications☆79Updated 6 months ago