Codebase for the ICLR '23 paper "wav2tok: Deep Sequence Tokenizer for Audio Retrieval"
☆36Feb 10, 2026Updated 3 weeks ago
Alternatives and similar repositories for wav2tok
Users who are interested in wav2tok are comparing it to the libraries listed below
- ☆17Jan 20, 2025Updated last year
- ☆19Feb 2, 2023Updated 3 years ago
- "Enhancing Neural Audio Fingerprint Robustness to Audio Degradation for Music Identification" ISMIR2025☆35Sep 11, 2025Updated 5 months ago
- The ArtificialSongGenerator automatically composes and compiles the Artificial Audio Multitrack dataset (AAM).☆27Nov 17, 2025Updated 3 months ago
- Implementation of Multi-Source Music Generation with Latent Diffusion.☆27Sep 12, 2024Updated last year
- LAFMA: A Latent Flow Matching Model for Text-to-Audio Generation (INTERSPEECH 2024)☆43Jun 13, 2024Updated last year
- Official repo for DisCoder: High-Fidelity Music Vocoder using Neural Audio Codecs presented at ICASSP 2025☆38Feb 24, 2025Updated last year
- ☆49Feb 12, 2026Updated 3 weeks ago
- Neural Network Audio FingerPrint☆63Mar 5, 2023Updated 3 years ago
- music semantic understanding evaluation benchmark☆25Aug 12, 2023Updated 2 years ago
- Speaker embedding for VI-SVC and VI-SVS, also for VITS; use this to replace the ID to implement voice cloning.☆30Sep 16, 2022Updated 3 years ago
- ☆251Feb 14, 2024Updated 2 years ago
- The official code repository for SongPrep: A Preprocessing Framework and End-to-end Model for Full-song Structure Parsing and Lyrics Tran…☆154Dec 8, 2025Updated 3 months ago
- Phonemes and durations labeling based on whisper small☆11Jul 7, 2024Updated last year
- ☆17Jun 24, 2025Updated 8 months ago
- ☆51Updated this week
- UnivNet: A Neural Vocoder with Multi-Resolution Spectrogram Discriminators for High-Fidelity Waveform Generation☆76Aug 30, 2021Updated 4 years ago
- Source code for "Learning Similarity Metrics for Melody Retrieval"☆29Oct 29, 2019Updated 6 years ago
- List of Podcast Feeds using iTunes API and script to download 6,000,000~ hours of English speech.☆31Apr 13, 2023Updated 2 years ago
- The source code for the paper XiaoiceSing2 (interspeech2023)☆49Jan 15, 2024Updated 2 years ago
- Unofficial PyTorch implementation of HiFi-GAN with fast MISR.☆15Mar 21, 2023Updated 2 years ago
- This is the official implementation of our neural-network-based fast diffuse room impulse response generator (FAST-RIR) for generating r…☆12Nov 30, 2021Updated 4 years ago
- ☆11Dec 17, 2025Updated 2 months ago
- My implementation of diffusion (like) models☆11Apr 14, 2023Updated 2 years ago
- This repository provides the materials used in "Unsupervised Melody-to-Lyric Generation" by Yufei Tian, Anjali Narayan-Chen, Shereen Orab…☆11Jul 6, 2023Updated 2 years ago
- LVCNet: Efficient Condition-Dependent Modeling Network for Waveform Generation☆80Feb 24, 2021Updated 5 years ago
- Official Code for SyllableLM: Learning Coarse Semantic Units for Speech Language Models☆61Jul 1, 2025Updated 8 months ago
- ☆29Jun 8, 2023Updated 2 years ago
- Unofficial implementation of NANSY++ in Pytorch Lightning☆50Mar 11, 2024Updated last year
- Voice conversion model for real-time speech synthesis using PPG (Phonetic PosteriorGram) as an intermediate feature, written in Pytorch.☆29Mar 3, 2022Updated 4 years ago
- ☆12Mar 11, 2025Updated 11 months ago
- An open-source Kazakh Emotional Text-to-Speech Dataset☆35Aug 1, 2025Updated 7 months ago
- Video Background Music Generation Using Unpaired Audio-Visual Data☆30Oct 8, 2024Updated last year
- Training with a 4G GPU in 10 minutes☆12Aug 9, 2023Updated 2 years ago
- Source code for "MusCaps: Generating Captions for Music Audio" (IJCNN 2021)☆85Dec 3, 2024Updated last year
- The source code and pre-trained model of the paper "On the Preparation and Validation of a Large-scale Dataset"☆63Updated this week
- A toolkit for any-to-any encoder-decoder voice conversion systems☆84Aug 10, 2023Updated 2 years ago
- ☆32Nov 25, 2023Updated 2 years ago
- This is the repository for the work "BridgeVoC: Revitalizing Neural Vocoder from a Restoration Perspective".☆64Nov 5, 2025Updated 4 months ago