madhavlab / wav2tok
Codebase for the ICLR '23 paper "wav2tok: Deep Sequence Tokenizer for Audio Retrieval"
☆31Updated last year
Related projects:
- [TOMM 2024] Automatic Lyric Transcription and Automatic Music Transcription from Multimodal Singing☆16Updated 3 weeks ago
- ☆26Updated this week
- Viterbi decoding in PyTorch☆23Updated 3 weeks ago
- My hybrid TTS network that combines VALL-E, VoiceBox, SpeechFlow, Seamless and TortoiseTTS into one☆27Updated last month
- Audio Generation model working with GPT-2 and VQVAE compressed representation of MelSpectrograms☆18Updated 11 months ago
- Temporary anonymous version☆22Updated 6 months ago
- GPT for FACodec☆13Updated 5 months ago
- ☆12Updated last year
- Phonemes and durations labeling based on whisper small☆12Updated 2 months ago
- ☆41Updated last year
- Project for MIDI to Audio Synthesis☆19Updated last year
- Digital Speech Processing in PyTorch.☆12Updated 2 years ago
- ☆21Updated this week
- 60k hours of phoneme-aligned audio from audio books☆18Updated last month
- ☆20Updated 2 years ago
- text to speech☆10Updated 6 months ago
- ☆25Updated this week
- A robust pitch tracker using synchro-squeezed fft and frequency domain autocorrelation☆34Updated 8 months ago
- ☆13Updated 2 weeks ago
- Voicebox: Text-Guided Multilingual Universal Speech Generation at Scale☆25Updated last year
- The source code for the paper CrossSinger (ASRU 2023)☆18Updated 11 months ago
- My vocoder experiments☆20Updated last month
- Official implementation of DGP-based multi-speaker speech synthesis with PyTorch☆24Updated 3 years ago
- Conditional Variational Auto-Encoder with Jointly Training FastSpeech2 and HiFi-GAN for End to End Text to Speech☆22Updated 2 years ago
- a guide to grapheme-to-phoneme conversion and phoneme list for ace singing voice synthesis engine☆31Updated 11 months ago
- ☆33Updated 2 months ago
- An implementation of Charactr, Inc's "WavThruVec: Latent speech representation as intermediate features for neural speech synthesis"☆24Updated last year
- Implementation of Acoustic BPE (Shen et al., 2024), extended for RVQ-based Neural Audio Codecs☆33Updated last week
- with alignment learning and continuous wavelet transform☆19Updated 2 years ago
- ☆25Updated last year