Audio Generation model working with GPT-2 and VQVAE compressed representation of MelSpectrograms
☆18 · Oct 8, 2023 · Updated 2 years ago
Alternatives and similar repositories for MelSpec_GPT_VQVAE
Users that are interested in MelSpec_GPT_VQVAE are comparing it to the libraries listed below.
- PyTorch implementation of Retriever: Learning Content-Style Representation ☆12 · Jan 27, 2023 · Updated 3 years ago
- 🎵 Partnership with AI to create Beats ☆10 · Oct 13, 2020 · Updated 5 years ago
- ☆37 · May 8, 2021 · Updated 4 years ago
- A repository comprising code for generating noisy speech data from clean data using deep learning methods ☆16 · Jul 12, 2021 · Updated 4 years ago
- TTS text normalization: converts digits and letters into Chinese characters ☆12 · Apr 27, 2024 · Updated last year
- Official repository for "Structure-Enhanced Pop Music Generation via Harmony-Aware Learning", ACM MM 2022. ☆15 · Mar 22, 2023 · Updated 3 years ago
- Wave-U-Net for automatic (drum) mixing ☆38 · Mar 24, 2023 · Updated 3 years ago
- Crawled from FreeMidi.org, MIDI files library including over twenty thousand files! ☆32 · Jun 6, 2020 · Updated 5 years ago
- python wrap for hts engine ☆14 · Jan 30, 2018 · Updated 8 years ago
- ☆12 · Jul 6, 2023 · Updated 2 years ago
- ☆25 · Apr 24, 2019 · Updated 6 years ago
- ☆15 · May 8, 2021 · Updated 4 years ago
- ☆19 · Feb 2, 2023 · Updated 3 years ago
- The source code for the paper CrossSinger (asru2023) ☆18 · Oct 12, 2023 · Updated 2 years ago
- ☆69 · Mar 31, 2021 · Updated 4 years ago
- ☆44 · Jun 10, 2024 · Updated last year
- Code for paper A3T: Alignment-Aware Acoustic and Text Pretraining for Speech Synthesis and Editing ☆89 · Sep 6, 2024 · Updated last year
- ICASSP 2021 accepted papers in term of voice conversion (VC) ☆18 · Apr 11, 2021 · Updated 4 years ago
- A single-layer, streaming codec model providing SOTA audio quality and discrete tokens designed for superior downstream modelability. ☆114 · Jun 4, 2025 · Updated 9 months ago
- ERISHA is a multilingual multispeaker expressive speech synthesis framework. It can transfer the expressivity to the speaker's voice for… ☆44 · Dec 17, 2020 · Updated 5 years ago
- A fundamental frequency estimation algorithm using features from the magnitude and phase spectrogram. ☆24 · Mar 29, 2021 · Updated 5 years ago
- The DJ Mix Dataset ☆17 · Sep 7, 2022 · Updated 3 years ago
- Please visit https://thuhcsi.github.io/SnakeGAN/ ☆37 · Apr 25, 2023 · Updated 2 years ago
- Please visit: https://thuhcsi.github.io/icassp2021-emotion-tts/ ☆34 · Mar 17, 2023 · Updated 3 years ago
- Phonemes and durations labeling based on whisper small ☆11 · Jul 7, 2024 · Updated last year
- ISMIR 2020 Paper repo: Music SketchNet: Controllable Music Generation via Factorized Representations of Pitch and Rhythm ☆83 · Oct 3, 2023 · Updated 2 years ago
- ☆35 · Sep 7, 2022 · Updated 3 years ago
- ☆55 · Jan 13, 2023 · Updated 3 years ago
- Realtime (streaming) DDSP in PyTorch compatible with neutone ☆50 · Feb 4, 2025 · Updated last year
- ☆80 · Aug 8, 2025 · Updated 7 months ago
- ☆45 · Dec 16, 2019 · Updated 6 years ago
- This is an unofficial implementation of universal melgan according to https://arxiv.org/abs/2011.09631 ☆23 · Aug 15, 2022 · Updated 3 years ago
- ☆11 · May 7, 2022 · Updated 3 years ago
- visual-text to speech ☆14 · Apr 3, 2022 · Updated 3 years ago
- [ICCV'21] The Right to Talk: An Audio-Visual Transformer Approach ☆20 · Aug 2, 2021 · Updated 4 years ago
- ☆26 · Sep 22, 2022 · Updated 3 years ago
- FCTalker: Fine and Coarse Grained Context Modeling for Expressive Conversational Speech Synthesis (Accepted by ISCSLP'2024) ☆26 · Feb 22, 2024 · Updated 2 years ago
- Real-time melgan based on cpu!!! ☆13 · Dec 3, 2019 · Updated 6 years ago
- ☆181 · Oct 24, 2023 · Updated 2 years ago