madhavlab/wav2tok

Readme badge preview -

If you own this repo, copy the snippet below and add it to your README.md

[![RelatedRepos](https://img.shields.io/badge/related-repos-yellow)](https://relatedrepos.com/gh/madhavlab/wav2tok)

madhavlab / wav2tok

Codebase for ICLR' 23 paper- ''wav2tok: Deep Sequence Tokenizer for Audio Retrieval"

☆36

Alternatives and similar repositories for wav2tok

Users that are interested in wav2tok are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.

Sorting:

streichgeorg / autosing
View on GitHub
☆18Jan 20, 2025Updated last year
CODEJIN / XiaoiceSing2
View on GitHub
☆19Feb 2, 2023Updated 3 years ago
fabianostermann / ArtificialSongGenerator
View on GitHub
The ArtificialSongGenerator automatically composes and compiles the Artifical Audio Multitrack dataset (AAM).
☆27Nov 17, 2025Updated 8 months ago
XZWY / MSLDM
View on GitHub
Implementation of Multi-Source Music Generation with Latent Diffusion.
☆29Sep 12, 2024Updated last year
Eps-Acoustic-Revolution-Lab / DUO_TOK
View on GitHub
Official repository for “Duo-Tok: Dual-Track Semantic Music Tokenizer for Vocal–Accompaniment Generation.”
☆32Nov 26, 2025Updated 7 months ago
Open source password manager - Proton Pass • Ad
Securely store, share, and autofill your credentials with Proton Pass, the end-to-end encrypted password manager trusted by millions.
SonyCSLParis / codicodec
View on GitHub
Encode and decode audio samples to/from continuous and discrete compressed representations!
☆121Nov 25, 2025Updated 7 months ago
stdio2016 / pfann
View on GitHub
Neural Network Audio FingerPrint
☆65Mar 5, 2023Updated 3 years ago
minzwon / musicfm
View on GitHub
☆268Feb 14, 2024Updated 2 years ago
gwh22 / LAFMA
View on GitHub
LAFMA: A Latent Flow Matching Model for Text-to-Audio Generation (INTERSPEECH 2024)
☆44Jun 13, 2024Updated 2 years ago
EvelynZhou / FAST-RIR
View on GitHub
This is the official implementation of our neural-network-based fast diffuse room impulse response generator (FAST-RIR) for generating r…
☆12Nov 30, 2021Updated 4 years ago
zhai-lw / L3AC
View on GitHub
A lightweight audio codec based on a single quantizer
☆35Sep 4, 2025Updated 10 months ago
raraz15 / neural-music-fp
View on GitHub
"Enhancing Neural Audio Fingerprint Robustness to Audio Degradation for Music Identification" ISMIR2025
☆38Sep 11, 2025Updated 10 months ago
k2-fsa / Flow2GAN
View on GitHub
Hybrid Flow Matching and GAN with Multi-Resolution Network for Few-Step High-Fidelity Audio Generation
☆144Mar 8, 2026Updated 4 months ago
JusperLee / Gull-Codec-Training
View on GitHub
☆12Mar 11, 2025Updated last year
Deploy to Railway using AI coding agents - Free Credits Offer • Ad
Use Claude Code, Codex, OpenCode, and more. Autonomous software development now has the infrastructure to match with Railway.
rishikksh20 / UnivNet-pytorch
View on GitHub
UnivNet: A Neural Vocoder with Multi-Resolution Spectrogram Discriminators for High-Fidelity Waveform Generation
☆76Aug 30, 2021Updated 4 years ago
ilaria-manco / muscaps
View on GitHub
Source code for "MusCaps: Generating Captions for Music Audio" (IJCNN 2021)
☆85Dec 3, 2024Updated last year
bytedance / Make-An-Audio-2
View on GitHub
a text-conditional diffusion probabilistic model capable of generating high fidelity audio.
☆197May 29, 2024Updated 2 years ago
yangdongchao / ALMTokenizer
View on GitHub
The demo page for ALMTokenizer
☆59Apr 14, 2025Updated last year
ctralie / GeometricCoverSongs
View on GitHub
Geometry features for block window cover song identification (a continuation of my ISMIR 2015 paper)
☆24Jul 6, 2023Updated 3 years ago
wbs2788 / MTM
View on GitHub
Multimodal Music Generation with Explicit Bridges and Retrieval Augmentation: A framework for generating multimodal music by bridging dif…
☆28Jan 21, 2025Updated last year
crlandsc / torch-l1-snr
View on GitHub
Variations of L1 SNR Loss function for training audio source separation machine learning models
☆45May 1, 2026Updated 2 months ago
zceng / LVCNet
View on GitHub
LVCNet: Efficient Condition-Dependent Modeling Network for Waveform Generation
☆80Feb 24, 2021Updated 5 years ago
seungheondoh / msu-benchmark
View on GitHub
music semantic understanding evaluation benchmark
☆24Aug 12, 2023Updated 2 years ago
Managed hosting for WordPress and PHP on Cloudways • Ad
Managed hosting for WordPress, Magento, Laravel, or PHP apps, on multiple cloud providers. Deploy in minutes on Cloudways by DigitalOcean.
habla-liaa / encodecmae
View on GitHub
Codebase for the paper 'EncodecMAE: Leveraging neural codecs for universal audio representation learning'
☆101Jul 24, 2024Updated last year
0417keito / JEN-1-COMPOSER-pytorch
View on GitHub
Unofficial implementation JEN-1 Composer: A Unified Framework for High-Fidelity Multi-Track Music Generation(https://arxiv.org/abs/2310.1…
☆32Jan 19, 2024Updated 2 years ago
exercise-book-yq / Supercodec
View on GitHub
☆51Mar 5, 2026Updated 4 months ago
PlayVoice / VI-Speaker
View on GitHub
Speaker embedding for VI-SVC and VI-SVS, alse for VITS; Use this to replace the ID to implement voice clone.
☆30Sep 16, 2022Updated 3 years ago
ismir-24-sub / unsupervised_compositional_representations
View on GitHub
ISMIR 24 Supplementary Material
☆14Oct 28, 2024Updated last year
ilya16 / ScorePerformer
View on GitHub
ScorePerformer: Expressive Piano Performance Rendering with Fine-Grained Control (ISMIR 2023)
☆42Mar 10, 2025Updated last year
SWivid / AUV
View on GitHub
An All-in-One Speech, Sound, Music Codec with Single Nested Codebook
☆28Oct 11, 2025Updated 9 months ago
xiaomi-research / diffrhythm2
View on GitHub
☆122Nov 6, 2025Updated 8 months ago
qiuqiangkong / mini_music_tagging
View on GitHub
☆13Jul 14, 2024Updated 2 years ago
Managed Kubernetes at scale on DigitalOcean • Ad
DigitalOcean Kubernetes includes the control plane, bandwidth allowance, container registry, automatic updates, and more for free.
lucidrains / rvq-vae-gpt
View on GitHub
My attempts at applying Soundstream design on learned tokenization of text and then applying hierarchical attention to text generation
☆90Oct 11, 2024Updated last year
janzuiderveld / continuous-audio-representations
View on GitHub
Official PyTorch implementation for "Towards Lightweight Controllable Audio Synthesis with Conditional Implicit Neural Representations".
☆21Dec 3, 2021Updated 4 years ago
ilaria-manco / multimodal-ml-music
View on GitHub
List of academic resources on Multimodal ML for Music
☆298Mar 25, 2023Updated 3 years ago
y-chan / hifi-gan-misrnet
View on GitHub
unofficial pytorch implementation of HiFi-GAN with fast MISR.
☆15Mar 21, 2023Updated 3 years ago
MWM-io / nansypp
View on GitHub
Unofficial implementation of NANSY++ in Pytorch Lightning
☆50Mar 11, 2024Updated 2 years ago
IS2AI / KazEmoTTS
View on GitHub
An open-source Kazakh Emotional Text-to-Speech Dataset
☆36Aug 1, 2025Updated 11 months ago
guozixunnicolas / FundamentalMusicEmbedding
View on GitHub
☆32Nov 25, 2023Updated 2 years ago