Codebase for ICLR' 23 paper- ''wav2tok: Deep Sequence Tokenizer for Audio Retrieval"
☆36Feb 10, 2026Updated 2 months ago
Alternatives and similar repositories for wav2tok
Users that are interested in wav2tok are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.
Sorting:
- ☆18Jan 20, 2025Updated last year
- "Enhancing Neural Audio Fingerprint Robustness to Audio Degradation for Music Identification" ISMIR2025☆35Sep 11, 2025Updated 7 months ago
- ☆19Feb 2, 2023Updated 3 years ago
- The ArtificialSongGenerator automatically composes and compiles the Artifical Audio Multitrack dataset (AAM).☆27Nov 17, 2025Updated 5 months ago
- Implementation of Multi-Source Music Generation with Latent Diffusion.☆27Sep 12, 2024Updated last year
- GPUs on demand by Runpod - Special Offer Available • AdRun AI, ML, and HPC workloads on powerful cloud GPUs—without limits or wasted spend. Deploy GPUs in under a minute and pay by the second.
- ☆253Feb 14, 2024Updated 2 years ago
- Neural Network Audio FingerPrint☆63Mar 5, 2023Updated 3 years ago
- LAFMA: A Latent Flow Matching Model for Text-to-Audio Generation (INTERSPEECH 2024)☆43Jun 13, 2024Updated last year
- This is the official implementation of our neural-network-based fast diffuse room impulse response generator (FAST-RIR) for generating r…☆12Nov 30, 2021Updated 4 years ago
- ☆57Feb 12, 2026Updated 2 months ago
- Official repo for DisCoder: High-Fidelity Music Vocoder using Neural Audio Codecs presented at ICASSP 2025☆41Feb 24, 2025Updated last year
- ☆12Mar 11, 2025Updated last year
- UnivNet: A Neural Vocoder with Multi-Resolution Spectrogram Discriminators for High-Fidelity Waveform Generation☆76Aug 30, 2021Updated 4 years ago
- The official code repository for SongPrep: A Preprocessing Framework and End-to-end Model for Full-song Structure Parsing and Lyrics Tran…☆155Dec 8, 2025Updated 4 months ago
- Serverless GPU API endpoints on Runpod - Get Bonus Credits • AdSkip the infrastructure headaches. Auto-scaling, pay-as-you-go, no-ops approach lets you focus on innovating your application.
- a text-conditional diffusion probabilistic model capable of generating high fidelity audio.☆193May 29, 2024Updated last year
- Source code for "MusCaps: Generating Captions for Music Audio" (IJCNN 2021)☆85Dec 3, 2024Updated last year
- Geometry features for block window cover song identification (a continuation of my ISMIR 2015 paper)☆24Jul 6, 2023Updated 2 years ago
- LVCNet: Efficient Condition-Dependent Modeling Network for Waveform Generation☆80Feb 24, 2021Updated 5 years ago
- music semantic understanding evaluation benchmark☆25Aug 12, 2023Updated 2 years ago
- Codebase for the paper 'EncodecMAE: Leveraging neural codecs for universal audio representation learning'☆100Jul 24, 2024Updated last year
- Unofficial implementation JEN-1 Composer: A Unified Framework for High-Fidelity Multi-Track Music Generation(https://arxiv.org/abs/2310.1…☆32Jan 19, 2024Updated 2 years ago
- ☆51Mar 5, 2026Updated last month
- The source code and pre-trained model of the paper "On the Preparation and Validation of a Large-scale Dataset"☆65Mar 5, 2026Updated last month
- Wordpress hosting with auto-scaling - Free Trial Offer • AdFully Managed hosting for WordPress and WooCommerce businesses that need reliable, auto-scalable performance. Cloudways SafeUpdates now available.
- ScorePerformer: Expressive Piano Performance Rendering with Fine-Grained Control (ISMIR 2023)☆41Mar 10, 2025Updated last year
- ISMIR 24 Supplementary Material☆14Oct 28, 2024Updated last year
- ☆18Jun 24, 2025Updated 9 months ago
- Speaker embedding for VI-SVC and VI-SVS, alse for VITS; Use this to replace the ID to implement voice clone.☆30Sep 16, 2022Updated 3 years ago
- My attempts at applying Soundstream design on learned tokenization of text and then applying hierarchical attention to text generation☆90Oct 11, 2024Updated last year
- List of academic resources on Multimodal ML for Music☆298Mar 25, 2023Updated 3 years ago
- Official PyTorch implementation for "Towards Lightweight Controllable Audio Synthesis with Conditional Implicit Neural Representations".☆21Dec 3, 2021Updated 4 years ago
- ☆13Jul 14, 2024Updated last year
- Hybrid Flow Matching and GAN with Multi-Resolution Network for Few-Step High-Fidelity Audio Generation☆140Mar 8, 2026Updated last month
- Open source password manager - Proton Pass • AdSecurely store, share, and autofill your credentials with Proton Pass, the end-to-end encrypted password manager trusted by millions.
- An open-source Kazakh Emotional Text-to-Speech Dataset☆36Aug 1, 2025Updated 8 months ago
- unofficial pytorch implementation of HiFi-GAN with fast MISR.☆15Mar 21, 2023Updated 3 years ago
- Unofficial implementation of NANSY++ in Pytorch Lightning☆50Mar 11, 2024Updated 2 years ago
- ☆32Nov 25, 2023Updated 2 years ago
- This is the repository for the work "BridgeVoC: Revitalizing Neural Vocoder from a Restoration Perspective".☆64Nov 5, 2025Updated 5 months ago
- Unofficial implementation JEN-1: Text-Guided Universal Music Generation with Omnidirectional Diffusion Models(https://arxiv.org/abs/2308.…☆55Jan 18, 2024Updated 2 years ago
- Phonemes and durations labeling based on whisper small☆11Jul 7, 2024Updated last year