karchkha/MelSpec_GPT_VQVAE

Readme badge preview -

If you own this repo, copy the snippet below and add it to your README.md

[![RelatedRepos](https://img.shields.io/badge/related-repos-yellow)](https://relatedrepos.com/gh/karchkha/MelSpec_GPT_VQVAE)

karchkha / MelSpec_GPT_VQVAE

Audio Generation model working with GPT-2 and VQVAE compressed representation of MelSpectrograms

☆18

Alternatives and similar repositories for MelSpec_GPT_VQVAE

Users that are interested in MelSpec_GPT_VQVAE are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.

Sorting:

revsic / torch-retriever-vc
View on GitHub
PyTorch implementation of Retriever: Learning Content-Style Representation
☆12Jan 27, 2023Updated 3 years ago
ChenPaulYu / beats-with-you
View on GitHub
🎵 Partnership with AI to create Beats
☆11Oct 13, 2020Updated 5 years ago
rhoposit / multilingual_VQVAE
View on GitHub
☆37May 8, 2021Updated 5 years ago
shashankshirol / GeneratingNoisySpeechData
View on GitHub
A repository comprising of code for generation of noisy speech data from clean data using deep learning methods
☆16Jul 12, 2021Updated 5 years ago
errolyan / text_normalization_CH
View on GitHub
TTS前，文本标准化，将数字字母处理转化为汉字
☆12Apr 27, 2024Updated 2 years ago
Deploy to Railway using AI coding agents - Free Credits Offer • Ad
Use Claude Code, Codex, OpenCode, and more. Autonomous software development now has the infrastructure to match with Railway.
f90 / Mix-Wave-U-Net
View on GitHub
Wave-U-Net for automatic (drum) mixing
☆38Mar 24, 2023Updated 3 years ago
RMSnow / HAT
View on GitHub
Official repository for "Structure-Enhanced Pop Music Generation via Harmony-Aware Learning", ACM MM 2022.
☆14Mar 22, 2023Updated 3 years ago
josephding23 / Free-Midi-Library
View on GitHub
Crawled from FreeMidi.org, MIDI files library including over twenty thousand files!
☆33Jun 6, 2020Updated 6 years ago
i3thuan5 / hts_engine_python
View on GitHub
python wrap for hts engine
☆14Jan 30, 2018Updated 8 years ago
LEEYOONHYUNG / GraphTTS
View on GitHub
☆12Jul 6, 2023Updated 3 years ago
yunyikristy / ttsGAN-ICLR2019
View on GitHub
☆25Apr 24, 2019Updated 7 years ago
rhoposit / icassp2021
View on GitHub
☆15May 8, 2021Updated 5 years ago
zengchang233 / CrossSinger
View on GitHub
The source code for the paper CrossSinger (asru2023)
☆18Oct 12, 2023Updated 2 years ago
CODEJIN / XiaoiceSing2
View on GitHub
☆19Feb 2, 2023Updated 3 years ago
Simple, predictable pricing with DigitalOcean hosting • Ad
Always know what you'll pay with monthly caps and flat pricing. Enterprise-grade infrastructure trusted by 600k+ customers.
LeoniusChen / Attentions-in-Tacotron
View on GitHub
☆69Mar 31, 2021Updated 5 years ago
SpeechResearch / speechresearch.github.io
View on GitHub
☆44Jun 10, 2024Updated 2 years ago
richardbaihe / a3t
View on GitHub
Code for paper A3T: Alignment-Aware Acoustic and Text Pretraining for Speech Synthesis and Editing
☆89Sep 6, 2024Updated last year
xinshengwang / ICASSP2021_paper_list-VC
View on GitHub
ICASSP 2021 accepted papers in term of voice conversion (VC)
☆18Apr 11, 2021Updated 5 years ago
ajinkyakulkarni14 / ERISHA
View on GitHub
ERISHA is a mulitilingual multispeaker expressive speech synthesis framework. It can transfer the expressivity to the speaker's voice for…
☆44Dec 17, 2020Updated 5 years ago
bastibe / MAPS-Scripts
View on GitHub
A fundamental frequency estimation algorithm using features from the magnitude and phase spectrogram.
☆25Mar 29, 2021Updated 5 years ago
thuhcsi / SnakeGAN
View on GitHub
Please visit https://thuhcsi.github.io/SnakeGAN/
☆37Apr 25, 2023Updated 3 years ago
thuhcsi / icassp2021-emotion-tts
View on GitHub
Please visit: https://thuhcsi.github.io/icassp2021-emotion-tts/
☆34Mar 17, 2023Updated 3 years ago
Infinity-INF / fast-phasr
View on GitHub
Phonemes and durations labeling based on whisper small
☆11Jul 7, 2024Updated 2 years ago
Managed hosting for WordPress and PHP on Cloudways • Ad
Managed hosting for WordPress, Magento, Laravel, or PHP apps, on multiple cloud providers. Deploy in minutes on Cloudways by DigitalOcean.
wq2012 / CurriculumVitae
View on GitHub
Curriculum Vitae of Quan Wang
☆15Updated this week
RetroCirce / Music-SketchNet
View on GitHub
ISMIR 2020 Paper repo: Music SketchNet: Controllable Music Generation via Factorized Representations of Pitch and Rhythm
☆84Oct 3, 2023Updated 2 years ago
MiniXC / LightningFastSpeech2
View on GitHub
☆55Jan 13, 2023Updated 3 years ago
ytsrt66589 / pjloop-gan
View on GitHub
☆35Sep 7, 2022Updated 3 years ago
yanggeng1995 / Multi-band-WaveRNN
View on GitHub
☆45Dec 16, 2019Updated 6 years ago
hs-oh-prml / EmotionControllableTextToSpeech
View on GitHub
☆21Jun 16, 2021Updated 5 years ago
hyakuchiki / realtimeDDSP
View on GitHub
Realtime (streaming) DDSP in PyTorch compatible with neutone
☆51Feb 4, 2025Updated last year
spring-media / DeepForcedAligner
View on GitHub
☆81Aug 8, 2025Updated 11 months ago
cyhuang-tw / robust-vc
View on GitHub
☆11May 7, 2022Updated 4 years ago
GPU virtual machines on DigitalOcean Gradient AI • Ad
Get to production fast with high-performance AMD and NVIDIA GPUs you can spin up in seconds. The definition of operational simplicity.
Yoshifumi-Nakano / visual-text-to-speech
View on GitHub
visual-text to speech
☆14Apr 3, 2022Updated 4 years ago
avi33 / universalmelgan
View on GitHub
This is an unofficial implementation of universal melgan according to https://arxiv.org/abs/2011.09631
☆23Aug 15, 2022Updated 3 years ago
uark-cviu / Right2Talk
View on GitHub
[ICCV'21] The Right to Talk: An Audio-Visual Transformer Approach
☆20Aug 2, 2021Updated 4 years ago
walker-hyf / FCTalker
View on GitHub
FCTalker: Fine and Coarse Grained Context Modeling for Expressive Conversational Speech Synthesis (Accepted by ISCSLP'2024)
☆26Feb 22, 2024Updated 2 years ago
CODEJIN / VITS_Diffusion
View on GitHub
☆26Sep 22, 2022Updated 3 years ago
rgzn-aiyun / melgan-cpu
View on GitHub
Real-time melgan based on cpu ！！！
☆13Dec 3, 2019Updated 6 years ago
jhtonyKoo / music_mixing_style_transfer
View on GitHub
☆181Oct 24, 2023Updated 2 years ago