Official repository for U-SAM (Interspeech 2025)
☆27Jun 3, 2025Updated 11 months ago
Alternatives and similar repositories for U-SAM
Users that are interested in U-SAM are comparing it to the libraries listed below.
- We propose C2SER, a novel audio-language model designed to enhance the stability and accuracy of speech emotion recognition through conte…☆47Mar 3, 2025Updated last year
- My hybrid TTS network that combines VALL-E, VoiceBox, SpeechFlow, Seamless and TortoiseTTS into one☆26Aug 5, 2024Updated last year
- Models and code for INTERSPEECH 2023 paper DistilXLSR: A Light Weight Cross-Lingual Speech Representation Model☆13Mar 30, 2025Updated last year
- Code for ACL 2023 main conference paper "Back Translation for Speech-to-text Translation Without Transcripts".☆12Oct 25, 2023Updated 2 years ago
- A toolkit dedicated to speech evaluation.☆23Sep 26, 2024Updated last year
- Real-time end-to-end singing voice conversion☆25Nov 3, 2024Updated last year
- ☆11Oct 14, 2023Updated 2 years ago
- [ACL 2024] Generative Pre-Trained Speech Language Model with Efficient Hierarchical Transformer☆70Nov 1, 2024Updated last year
- WavReward: Spoken Dialogue Models With Generalist Reward Evaluators☆56May 15, 2025Updated 11 months ago
- An end-to-end ASR Transformer model training repo☆13Dec 8, 2021Updated 4 years ago
- 🤗 R1-AQA Model: mispeech/r1-aqa☆323Mar 28, 2025Updated last year
- Implementation for paper "Disentangled Speech Representation Learning for One-Shot Cross-Lingual Voice Conversion Using ß-VAE"☆44Apr 10, 2023Updated 3 years ago
- A mock data generation tool for databases that helps users quickly and efficiently generate database data for test environments.☆15Oct 28, 2025Updated 6 months ago
- ☆60Oct 22, 2025Updated 6 months ago
- Efficient Personalized Speech Enhancement through Self-Supervised Learning☆23Mar 12, 2023Updated 3 years ago
- ☆12Nov 25, 2023Updated 2 years ago
- Official implementation of "AEROMamba: An efficient architecture for audio super-resolution using generative adversarial networks and sta…☆50Nov 11, 2025Updated 5 months ago
- Code for ACL 2023 main conference paper "CMOT: Cross-modal Mixup via Optimal Transport for Speech Translation"☆17Oct 29, 2024Updated last year
- Fully Quantized Neural Networks For Speech Enhancement☆63Feb 15, 2024Updated 2 years ago
- This repo contains the code to reproduce the paper: "Enriched Music Representations with Multiple Cross-modal Contrastive Learning"☆15Jun 22, 2023Updated 2 years ago
- List of NN-based signal processing papers☆22Jun 5, 2023Updated 2 years ago
- Code for ACL 2023 main conference paper "Understanding and Bridging the Modality Gap for Speech Translation".☆17Oct 25, 2023Updated 2 years ago
- Reference-aware automatic speech evaluation toolkit☆182Dec 5, 2024Updated last year
- JEPAs for audio representation learning☆20Apr 22, 2026Updated last week
- NANSY++: Unified Voice Synthesis with Neural Analysis and Synthesis☆152Feb 11, 2023Updated 3 years ago
- Voicebox: Text-Guided Multilingual Universal Speech Generation at Scale☆28Aug 4, 2023Updated 2 years ago
- ☆13Aug 21, 2022Updated 3 years ago
- ☆25Jan 2, 2024Updated 2 years ago
- Denoising Diffusion Autoregressive Model for Raw Speech Waveform Generation☆32Mar 8, 2024Updated 2 years ago
- ☆23Jan 29, 2026Updated 3 months ago
- The implementation of "End-to-End Neural Speaker Diarization with an Iterative Adaptive Attractor Estimation", which is accepted by Neura…☆11Aug 27, 2023Updated 2 years ago
- This repo is an exploratory experiment to enable frozen pretrained RWKV language models to accept speech modality input. We followed the …☆54Dec 23, 2024Updated last year
- ☆24Jun 30, 2023Updated 2 years ago
- Google Colab for testing SoftVC VITS Singing Voice Conversion for AI capable of changing the singer within music files.☆13Apr 21, 2023Updated 3 years ago
- Non-parallel voice conversion called ICRCycleGAN-VC based on CycleGAN and Inception-resNet module by Afiuny☆15Apr 15, 2026Updated 2 weeks ago
- Seamlessly shorten songs using AWS Lambda☆19Apr 28, 2019Updated 7 years ago
- Official repository for the paper Singing Voice Graph Modeling for SingFake Detection (Interspeech 2024).☆25Sep 19, 2025Updated 7 months ago
- Official source code of the INTERSPEECH 2023 paper: "Audio-Visual Speech Separation in Noisy Environments with a Lightweight Iterative Mo…☆20Sep 1, 2023Updated 2 years ago
- Minimalistic serverless boilerplate based on NextJS and Firebase 🔥☆15Jan 7, 2023Updated 3 years ago