teo-sl / Audio-Super-Resolution-ViT
This repository contains the source code for two deep learning models for the audio super-resolution task.
☆12 Updated last year
Related projects
Alternatives and complementary repositories for Audio-Super-Resolution-ViT
- This is the official implementation of our multi-channel multi-speaker multi-spatial neural audio codec architecture. ☆42 Updated 2 months ago
- ☆79 Updated last year
- Official implementation of DualCycleGAN for nonparallel audio super resolution ☆50 Updated 2 years ago
- Adaptive Vocoder for Custom Voice ☆58 Updated 2 years ago
- Contains the code associated with the ICLR submission for our text-to-speech diffusion model ☆50 Updated last year
- NU-Wave 2: A General Neural Audio Upsampling Model for Various Sampling Rates [WIP] ☆24 Updated 2 years ago
- AudioSR-Upsampling (any -> 48kHz) ☆38 Updated 9 months ago
- [InterSpeech'2024] FluentEditor: Text-based Speech Editing by Considering Acoustic and Prosody Consistency ☆48 Updated 3 weeks ago
- A High-resolution implementation of HiFi-GAN Vocoder for Voice Conversion. ☆30 Updated last year
- Test code disclosure for the research paper "UnDiff: Unsupervised Voice Restoration with Unconditional Diffusion Model", as a supplementa… ☆18 Updated last year
- X-E-Speech: Joint Training Framework of Non-Autoregressive Cross-lingual Emotional Text-to-Speech and Voice Conversion ☆70 Updated 7 months ago
- Autovocoder: Fast Waveform Generation from a Learned Speech Representation using Differentiable Digital Signal Processing ☆68 Updated last year
- PyTorch implementation of the ICASSP-24 paper: "Improving Audio Captioning Models with Fine-grained Audio Features, Text Embedding Superv… ☆31 Updated 10 months ago
- An implementation of Charactr, Inc's "WavThruVec: Latent speech representation as intermediate features for neural speech synthesis" ☆28 Updated last year
- iSeparate library for the SDX2023 challenge ☆13 Updated 11 months ago
- GPT-style network for phonemization with durations of text ☆62 Updated 8 months ago
- ☆50 Updated 9 months ago
- A diffusion-based cross-lingual voice conversion model, as my bachelor's thesis ☆43 Updated last year
- Source code and demo for INTERSPEECH 2024 paper: Noise-robust Speech Separation with Fast Generative Correction ☆33 Updated this week
- ☆13 Updated 11 months ago
- Zero-Shot Emotion Style Transfer ☆37 Updated 7 months ago
- [IJCAI'23] Learning to Speak from Text for Low-Resource TTS ☆64 Updated last year
- Implementation for "Music Enhancement via Image Translation and Vocoding" ☆52 Updated 2 years ago
- Repo for source code of EBEN: Extreme Bandwidth Extension Network ☆69 Updated last month
- An implementation of "Towards Improving Harmonic Sensitivity and Prediction Stability for Singing Melody Extraction", in ISMIR 2023 ☆19 Updated 10 months ago
- Stable Audio Unofficial Implementation: Latent Diffusion for Audio Generation ☆23 Updated 9 months ago
- ☆40 Updated 5 months ago
- ☆32 Updated 2 months ago
- Codebase and project page for EDMSound ☆29 Updated last year
- ☆42 Updated last month