jishengpeng/WavTokenizer

[ICLR 2025] SOTA discrete acoustic codec models with 40/75 tokens per second for audio language modeling

☆1,309

Alternatives and similar repositories for WavTokenizer

Users who are interested in WavTokenizer are comparing it to the libraries listed below.

jishengpeng / Languagecodec
[ACL 2025 Oral] Language-Codec: Reducing the Gaps Between Discrete Codec Representation and Speech Language Models
☆208 · Jun 25, 2025 · Updated last year
jishengpeng / ControlSpeech
[ACL 2025 Main] ControlSpeech: Towards Simultaneous Zero-shot Speaker Cloning and Zero-shot Language Style Control With Decoupled Codec
☆276 · Nov 22, 2024 · Updated last year
yangdongchao / AcademiCodec
AcademiCodec: An Open Source Audio Codec Model for Academic Research
☆674 · Dec 27, 2023 · Updated 2 years ago
ZhangXInFD / SpeechTokenizer
This is the code for the SpeechTokenizer presented in the SpeechTokenizer: Unified Speech Tokenizer for Speech Language Models. Samples a…
☆657 · Jun 9, 2024 · Updated 2 years ago
jishengpeng / TextrolSpeech
[ICASSP 2024] TextrolSpeech: A Text Style Control Speech Corpus With Codec Language Text-to-Speech Models
☆187 · Nov 22, 2024 · Updated last year
shenjunjiekoda / knight
knight is a static analysis tool for C/C++ programs.
☆213 · Dec 27, 2024 · Updated last year
ZivJia / hmi-workspace
A workspace for HMI tools
☆163 · Jul 11, 2024 · Updated 2 years ago
Aria-K-Alethia / BigCodec
Official implementation of the paper "BigCodec: Pushing the Limits of Low-Bitrate Neural Speech Codec"
☆218 · Sep 19, 2024 · Updated last year
MingXiangL / DEVIL
Evaluation of Text-to-Video Generation Models: A Dynamics Perspective [NeurIPS 2024].
☆274 · Dec 3, 2024 · Updated last year
MingXiangL / AttentionShift
Official Implementation of AttentionShift: Iteratively Estimated Part-based Attention Map for Pointly Supervised Instance Segmentation
☆155 · Oct 18, 2024 · Updated last year
AaronZ345 / StyleSinger
PyTorch Implementation of StyleSinger (AAAI 2024): Style Transfer for Out-of-Domain Singing Voice Synthesis
☆420 · Aug 15, 2025 · Updated 11 months ago
Falling-dow / Unsupervised-Image-Enhancement-with-CNN-and-GAN
Advanced Unsupervised Image Enhancement with GAN
☆247 · Nov 11, 2024 · Updated last year
Rhythm-Byte / SchemaDiff
☆246 · Nov 24, 2024 · Updated last year
wenlongliaoEE / ETDToolbox
☆175 · Feb 21, 2025 · Updated last year
AaronZ345 / TCSinger
PyTorch Implementation of TCSinger (EMNLP 2024): Zero-Shot Singing Voice Synthesis with Style Transfer and Multi-Level Style Control
☆386 · Oct 7, 2025 · Updated 9 months ago
PeiranLi0930 / L-SVD
Large-Scale Selfie Video Dataset (L-SVD): A Benchmark for Emotion Recognition
☆306 · Aug 18, 2024 · Updated last year
SiyangLi99 / open-alteryx-macro
Welcome to the 'Open-Alteryx-Macro' project. This project is aimed at providing an open-source solution for managing and updating Alteryx…
☆156 · May 25, 2024 · Updated 2 years ago
banggx / morgana-form
Morgana questionnaire form editor: low-code rapid form building, AI form generation, and form data collection and statistics
☆147 · Jun 21, 2026 · Updated last month
jtun-coder / JtunRouter
It is an Android-based application that enables managing hotspot properties through a web interface, providing mobile routing functionali…
☆156 · Jul 14, 2026 · Updated 2 weeks ago
zhenye234 / X-Codec-2.0
Codec for paper: LLaSA: Scaling Train-time and Inference-time Compute for LLaMA-based Speech Synthesis
☆361 · Jun 25, 2026 · Updated last month
JinhuaLiang / WavCraft
Official repo for WavCraft, an AI agent for audio creation and editing
☆523 · Feb 15, 2025 · Updated last year
descriptinc / descript-audio-codec
State-of-the-art audio codec with 90x compression factor. Supports 44.1kHz, 24kHz, and 16kHz mono/stereo audio.
☆1,840 · Jul 16, 2026 · Updated last week
elleryqueenhomels / AI_for_Atari
Deep Reinforcement Learning Algorithms for solving Atari 2600 Games
☆143 · Mar 23, 2023 · Updated 3 years ago
BiuYeaf / A-general-framework-to-Prompt-tuning-LLM-model
☆141 · May 8, 2024 · Updated 2 years ago
Nonac / DDOPaI
☆120 · Sep 30, 2024 · Updated last year
CGCL-codes / YiTu
YiTu is an easy-to-use runtime to fully exploit the hybrid parallelism of different hardware (e.g., GPU) to efficiently support the exec…
☆254 · Jan 7, 2026 · Updated 6 months ago
jishengpeng / WavChat
A Survey of Spoken Dialogue Models (60 pages)
☆316 · Nov 28, 2024 · Updated last year
zhenye234 / xcodec
AAAI 2025: Codec Does Matter: Exploring the Semantic Shortcoming of Codec for Audio Language Model
☆308 · Oct 12, 2025 · Updated 9 months ago
Credit-card-monitoring-and-fraud-check / Credit_card_monitoring_and_check
A code repository designed to show the best GitHub has to offer.
☆165 · Jun 30, 2024 · Updated 2 years ago
orchain / prysm
☆296 · Sep 14, 2025 · Updated 10 months ago
SSSYDYSSS / TransProPy
A Python package that integrates algorithms and various machine learning approaches to extract features (genes) effective for classificati…
☆251 · Jan 15, 2026 · Updated 6 months ago
yileijin / Bootstrap-GS
☆251 · Feb 11, 2025 · Updated last year
kaitoInfra / fast-twitter-api
Simple yet powerful Twitter data retrieval SDK with multi-language support. No Limits, No Auth Required
☆183 · May 28, 2026 · Updated 2 months ago
RexGRM / Alz-IDProteinExplorer
Visualization, simulation, and manipulation of intrinsically disordered proteins with Gibbs sampling
☆288 · Oct 24, 2024 · Updated last year
gersteinlab / ML-Bench
ML-Bench: Evaluating Large Language Models and Agents for Machine Learning Tasks on Repository-Level Code (https://arxiv.org/abs/2311.098…
☆314 · Jul 31, 2025 · Updated 11 months ago
530051970 / auth-hub-demo
User Identity Scaffolding for Multiple OIDC Authentications for User
☆95 · Jun 14, 2025 · Updated last year
sjiang325 / Abdominal-Trauma-Detection-code
☆134 · Sep 24, 2024 · Updated last year
wYaobiz / awesome-self-sovereign-identity
An awesome list of self-sovereign identity resources.
☆137 · Jul 9, 2024 · Updated 2 years ago
SSSYDYSSS / TransProR
Analysis and visualization of multi-omics data. In ongoing development: multi-modal fusion, sparse learning, and spatio-temporal effects.…
☆206 · Jan 15, 2026 · Updated 6 months ago