dr-pato / SSGD
Code of the paper "Low-Latency Speech Separation Guided Diarization for Telephone Conversations"
☆14Updated 2 years ago
Alternatives and similar repositories for SSGD
Users who are interested in SSGD are comparing it to the repositories listed below
- ☆65Updated 2 years ago
- Official PyTorch inference code for the Interspeech 2025 paper: Efficient Speech Enhancement via Embeddings from Pre-trained Generative A…☆69Updated 4 months ago
- Spherical residual vector quantization (SRVQ)☆30Updated last year
- Apply Score diffusion to improve speech signals recorded under various adverse conditions and distortions, including noise, reverberation…☆69Updated last year
- ☆54Updated 2 years ago
- A toolkit for researchers in multimodal sound separation.☆16Updated 2 years ago
- Streaming Audiotransformers for online Audio tagging☆48Updated last year
- ☆49Updated 6 months ago
- Official Implementation of TSELM: Target speaker extraction using discrete tokens and language models☆49Updated 6 months ago
- E2E TTS using Conditional Flow Matching (Experimental*)☆71Updated last year
- (R&D) Text to speech using phonemes as inputs and audio codec codes as outputs. Loosely based on MegaByte, VALL-E and Encodec.☆48Updated 2 years ago
- ☆16Updated 10 months ago
- Ultrafast GAN based Vocoder for Text to Speech☆50Updated 3 years ago
- PyTorch implementation of LearnableUpsamplingLayer (NaturalSpeech, Tan et al., 2022)☆55Updated last year
- An evaluation set for large-scale trained TTS models (Coming in Sep 2024)☆12Updated last year
- [ACMMM'2024] Generative Expressive Conversational Speech Synthesis☆40Updated 11 months ago
- We Speech Toolkit, LLM based Speech Toolkit for Speech Understanding, Generation, and Interaction☆69Updated last week
- Official source code of the INTERSPEECH 2023 paper: "Audio-Visual Speech Separation in Noisy Environments with a Lightweight Iterative Mo…☆20Updated 2 years ago
- FINALLY: Fast and universal speech enhancement model delivering studio-quality audio for a wide range of recordings.☆21Updated 2 months ago
- Efficient Personalized Speech Enhancement through Self-Supervised Learning☆21Updated 2 years ago
- Official repository for Mamba-based Segmentation Model for Speaker Diarization☆43Updated 5 months ago
- This is the official train-dev-test release of the Interspeech2024 Discrete Speech Representation Challenge.☆32Updated last year
- Implementation of SpatialCodec.☆62Updated 2 years ago
- Ultra-low-bitrate Speech Codec for Speech Language Modeling Applications☆81Updated 10 months ago
- This repo contains conv-tasnet for basis-melgan. If you want to get code of basis-melgan, please refer to FastVocoder.☆21Updated 4 years ago
- An STFT/iSTFT written in PyTorch using 1D convolutions☆32Updated last year
- ☆80Updated 3 months ago
- The open source code of ALMTokenizer2: Towards Low bit-rate and Semantic-rich Audio Tokenizer with Flow-based Scalar Diffusion Transforme…☆42Updated last month
- The implementation of MDNet, which is in submission to Interspeech2022☆14Updated 3 years ago
- This repository contains prompts & best practices to annotate audio clips with a very high degree of details using Audio-Language-Models☆35Updated last year