Official repository for U-SAM (Interspeech 2025)
☆26Jun 3, 2025Updated 9 months ago
Alternatives and similar repositories for U-SAM
Users that are interested in U-SAM are comparing it to the libraries listed below.
- We propose C2SER, a novel audio-language model designed to enhance the stability and accuracy of speech emotion recognition through conte…☆44Mar 3, 2025Updated last year
- My hybrid TTS network that combines VALL-E, VoiceBox, SpeechFlow, Seamless, and TortoiseTTS into one☆26Aug 5, 2024Updated last year
- Models and codes for INTERSPEECH 2023 paper DistilXLSR: A Light Weight Cross-Lingual Speech Representation Model☆13Mar 30, 2025Updated 11 months ago
- Code for ACL 2023 main conference paper "Back Translation for Speech-to-text Translation Without Transcripts".☆12Oct 25, 2023Updated 2 years ago
- A toolkit dedicated to speech evaluation.☆23Sep 26, 2024Updated last year
- Real-time end-to-end singing voice conversion☆24Nov 3, 2024Updated last year
- [ACL 2024] Generative Pre-Trained Speech Language Model with Efficient Hierarchical Transformer☆69Nov 1, 2024Updated last year
- ☆11Oct 14, 2023Updated 2 years ago
- WavReward: Spoken Dialogue Models With Generalist Reward Evaluators☆54May 15, 2025Updated 10 months ago
- An end-to-end ASR Transformer model training repo☆13Dec 8, 2021Updated 4 years ago
- 🤗 R1-AQA Model: mispeech/r1-aqa☆317Mar 28, 2025Updated 11 months ago
- Implementation for paper "Disentangled Speech Representation Learning for One-Shot Cross-Lingual Voice Conversion Using β-VAE"☆44Apr 10, 2023Updated 2 years ago
- A mock data generation tool for databases that helps users quickly and efficiently generate database data for test environments.☆15Oct 28, 2025Updated 4 months ago
- ☆60Oct 22, 2025Updated 5 months ago
- Efficient Personalized Speech Enhancement through Self-Supervised Learning☆23Mar 12, 2023Updated 3 years ago
- ☆12Nov 25, 2023Updated 2 years ago
- Official implementation of "AEROMamba: An efficient architecture for audio super-resolution using generative adversarial networks and sta…☆50Nov 11, 2025Updated 4 months ago
- Fully Quantized Neural Networks For Speech Enhancement☆63Feb 15, 2024Updated 2 years ago
- Code for ACL 2023 main conference paper "CMOT: Cross-modal Mixup via Optimal Transport for Speech Translation"☆17Oct 29, 2024Updated last year
- List of NN-based signal processing papers☆22Jun 5, 2023Updated 2 years ago
- This repo contains the code to reproduce the paper: "Enriched Music Representations with Multiple Cross-modal Contrastive Learning"☆15Jun 22, 2023Updated 2 years ago
- JEPAs for audio representation learning☆19Jun 22, 2025Updated 9 months ago
- Code for ACL 2023 main conference paper "Understanding and Bridging the Modality Gap for Speech Translation".☆17Oct 25, 2023Updated 2 years ago
- Reference-aware automatic speech evaluation toolkit☆180Dec 5, 2024Updated last year
- NANSY++: Unified Voice Synthesis with Neural Analysis and Synthesis☆151Feb 11, 2023Updated 3 years ago
- Voicebox: Text-Guided Multilingual Universal Speech Generation at Scale☆28Aug 4, 2023Updated 2 years ago
- Denoising Diffusion Autoregressive Model for Raw Speech Waveform Generation☆32Mar 8, 2024Updated 2 years ago
- ☆25Jan 2, 2024Updated 2 years ago
- ☆13Aug 21, 2022Updated 3 years ago
- The implementation of "End-to-End Neural Speaker Diarization with an Iterative Adaptive Attractor Estimation", which is accepted by Neura…☆11Aug 27, 2023Updated 2 years ago
- ☆23Jan 29, 2026Updated last month
- ☆23Jun 30, 2023Updated 2 years ago
- This repo is an exploratory experiment to enable frozen pretrained RWKV language models to accept speech modality input. We followed the …☆54Dec 23, 2024Updated last year
- Google Colab for testing SoftVC VITS Singing Voice Conversion, an AI capable of changing the singer within music files.☆13Apr 21, 2023Updated 2 years ago
- Official repository for the paper Singing Voice Graph Modeling for SingFake Detection (Interspeech 2024).☆25Sep 19, 2025Updated 6 months ago
- Non-parallel voice conversion called ICRCycleGAN-VC based on CycleGAN and Inception-resNet module by Afiuny☆15Oct 30, 2025Updated 4 months ago
- Seamlessly shorten songs using AWS Lambda☆19Apr 28, 2019Updated 6 years ago
- Official source code of the INTERSPEECH 2023 paper: "Audio-Visual Speech Separation in Noisy Environments with a Lightweight Iterative Mo…☆20Sep 1, 2023Updated 2 years ago
- Minimalistic serverless boilerplate based on NextJS and Firebase 🔥☆15Jan 7, 2023Updated 3 years ago