Official repository for U-SAM (Interspeech 2025)
☆25Jun 3, 2025Updated 9 months ago
Alternatives and similar repositories for U-SAM
Users that are interested in U-SAM are comparing it to the libraries listed below
Sorting:
- We propose C2SER, a novel audio-language model designed to enhance the stability and accuracy of speech emotion recognition through conte…☆43Mar 3, 2025Updated last year
- A toolkit dedicate for speech evaluation.☆23Sep 26, 2024Updated last year
- My hybrid TTS network that combines, VALL-E, VoiceBox, SpeechFlow, Seamless and TortoiseTTS into one☆26Aug 5, 2024Updated last year
- Models and codes for INTERSPEECH 2023 paper DistilXLSR: A Light Weight Cross-Lingual Speech Representation Model☆13Mar 30, 2025Updated 11 months ago
- The implementation of "End-to-End Neural Speaker Diarization with an Iterative Adaptive Attractor Estimation", which is accepted by Neura…☆11Aug 27, 2023Updated 2 years ago
- FlowMirror-HydraVox — A natively accelerated multi-head autoregressive TTS system derived from CosyVoice 3.0. It predicts multiple tokens…☆38Feb 17, 2026Updated 2 weeks ago
- [ACL 2024] Generative Pre-Trained Speech Language Model with Efficient Hierarchical Transformer☆68Nov 1, 2024Updated last year
- Official implementation of "AEROMamba: An efficient architecture for audio super-resolution using generative adversarial networks and sta…☆50Nov 11, 2025Updated 3 months ago
- Differentiable Mean Opinion Score Regularization for Perceptual Speech Enhancement☆25Apr 16, 2023Updated 2 years ago
- Efficient Personalized Speech Enhancement through Self-Supervised Learning☆23Mar 12, 2023Updated 2 years ago
- List of NN based singal processing papers☆22Jun 5, 2023Updated 2 years ago
- Implementation for paper "Disentangled Speech Representation Learning for One-Shot Cross-Lingual Voice Conversion Using ß-VAE"☆44Apr 10, 2023Updated 2 years ago
- Speech Resynthesis and Language Modeling☆27Jun 11, 2025Updated 8 months ago
- ☆23Jun 30, 2023Updated 2 years ago
- Landing Page for Divide and Remaster v3☆25Jul 29, 2025Updated 7 months ago
- Real-time end-to-end singing voice convertion☆24Nov 3, 2024Updated last year
- An End-to-End Pipeline for Enhanced French Text-to-Speech with SSML Prosody Control☆31Jan 13, 2026Updated last month
- ☆29Jan 15, 2025Updated last year
- Official repository for the paper Singing Voice Graph Modeling for SingFake Detection (Interspeech 2024).☆25Sep 19, 2025Updated 5 months ago
- Official source code of the INTERSPEECH 2023 paper: "Audio-Visual Speech Separation in Noisy Environments with a Lightweight Iterative Mo…☆20Sep 1, 2023Updated 2 years ago
- ☆59Oct 22, 2025Updated 4 months ago
- HiFTNet wav/audio super-resolution 16/24 kHz to 48 kHz☆24Jan 2, 2024Updated 2 years ago
- ☆25Jan 2, 2024Updated 2 years ago
- Voicebox: Text-Guided Multilingual Universal Speech Generation at Scale☆28Aug 4, 2023Updated 2 years ago
- NANSY++: Unified Voice Synthesis with Neural Analysis and Synthesis☆151Feb 11, 2023Updated 3 years ago
- Code for INTERSPEECH 2023 paper "mdctGAN: Taming transformer-based GAN for speech super-resolution with Modified DCT spectra"☆66Jun 3, 2023Updated 2 years ago
- Denoising Diffusion Autoregressive Model for Raw Speech Waveform Generation☆32Mar 8, 2024Updated last year
- Official implementation of MelHuBERT☆68Feb 21, 2026Updated last week
- Reference-aware automatic speech evaluation toolkit☆180Dec 5, 2024Updated last year
- ☆32Jan 9, 2024Updated 2 years ago
- ☆62May 31, 2024Updated last year
- Algorithm for blind estimation of reverberation time☆35Jun 6, 2024Updated last year
- Filtering and Noise Adding Tool☆29May 27, 2022Updated 3 years ago
- Spherical residual vector quantization (SRVQ)☆31Aug 25, 2024Updated last year
- NOMAD: Non-Matching Audio Distance (ICASSP 2024)☆30Jun 17, 2025Updated 8 months ago
- 🤗 R1-AQA Model: mispeech/r1-aqa☆314Mar 28, 2025Updated 11 months ago
- Official Implementation (Pytorch) of "Constant Acceleration Flow", NeurIPS 2024☆35Oct 31, 2025Updated 4 months ago
- ☆36Sep 6, 2025Updated 5 months ago
- ConsistencyTTA: Accelerating Diffusion-Based Text-to-Audio Generation with Consistency Distillation☆39Nov 20, 2024Updated last year