yanghaha0908/FastHuBERT

Readme badge preview -

If you own this repo, copy the snippet below and add it to your README.md

[![RelatedRepos](https://img.shields.io/badge/related-repos-yellow)](https://relatedrepos.com/gh/yanghaha0908/FastHuBERT)

yanghaha0908 / FastHuBERT

Official implementation for Fast-HuBERT: An Efficient Training Framework for Self-Supervised Speech Representation Learning

☆100

Alternatives and similar repositories for FastHuBERT

Users that are interested in FastHuBERT are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.

Sorting:

pyf98 / DPHuBERT
View on GitHub
INTERSPEECH 2023: "DPHuBERT: Joint Distillation and Pruning of Self-Supervised Speech Models"
☆118Jan 26, 2024Updated 2 years ago
lucadellalib / discrete-wavlm-codec
View on GitHub
A neural speech codec based on discrete WavLM representations
☆26Aug 28, 2024Updated last year
yzGuu830 / efficient-speech-codec
View on GitHub
[EMNLP 2024] ESC: Efficient Speech Coding with Cross-Scale Residual Vector Quantized Transformers
☆126Mar 20, 2025Updated last year
Aria-K-Alethia / laughter-synthesis
View on GitHub
Official implementation of the paper "Laughter Synthesis using Pseudo Phonetic Tokens with a Large-scale In-the-wild Laughter Corpus" acc…
☆77Jul 16, 2023Updated 3 years ago
X-LANCE / UniCATS-CTX-vec2wav
View on GitHub
[AAAI 2024] Code for CTX-vec2wav in UniCATS
☆130Jun 11, 2024Updated 2 years ago
Simple, predictable pricing with DigitalOcean hosting • Ad
Always know what you'll pay with monthly caps and flat pricing. Enterprise-grade infrastructure trusted by 600k+ customers.
0nutation / USLM
View on GitHub
Unified Speech Language Model for paper "SpeechTokenizer: Unified Speech Tokenizer for Speech Large Language Models"(ICLR 2024)
☆152Sep 14, 2023Updated 2 years ago
vectominist / MiniASR
View on GitHub
A mini, simple, and fast end-to-end automatic speech recognition toolkit.
☆53Dec 6, 2022Updated 3 years ago
Ereboas / MagiCodec
View on GitHub
A single-layer, streaming codec model providing SOTA audio quality and discrete tokens designed for superior downstream modelability.
☆125Jun 4, 2025Updated last year
MiscellaneousStuff / PhoneLM
View on GitHub
(R&D) Text to speech using phonemes as inputs and audio codec codes as outputs. Loosely based on MegaByte, VALL-E and Encodec.
☆48Sep 4, 2023Updated 2 years ago
voidful / Codec-SUPERB
View on GitHub
Audio Codec Speech processing Universal PERformance Benchmark
☆308Jul 4, 2026Updated 3 weeks ago
ZhangXInFD / SpeechTokenizer
View on GitHub
This is the code for the SpeechTokenizer presented in the SpeechTokenizer: Unified Speech Tokenizer for Speech Language Models. Samples a…
☆658Jun 9, 2024Updated 2 years ago
shengcanxu / canoSpeech
View on GitHub
text to speech
☆10Mar 19, 2024Updated 2 years ago
adelacvg / diff-vits
View on GitHub
☆39Oct 1, 2023Updated 2 years ago
lakahaga / dc-comix-tts
View on GitHub
Implementation of DCComix TTS: An End-to-End Expressive TTS with Discrete Code Collaborated with Mixer
☆74Aug 21, 2023Updated 2 years ago
Open source password manager - Proton Pass • Ad
Securely store, share, and autofill your credentials with Proton Pass, the end-to-end encrypted password manager trusted by millions.
yangdongchao / SimpleSpeech
View on GitHub
The open source code for SimpleSpeech series
☆147Oct 8, 2024Updated last year
scutcsq / Neural-Transducers-for-Two-Stage-Text-to-Speech-via-Semantic-Token-Prediction
View on GitHub
Unofficial pytorch reproduction for the paper "Utilizing Neural Transducers for Two-Stage Text-to-Speech via Semantic Token Prediction" (…
☆60Apr 4, 2024Updated 2 years ago
Alexander-H-Liu / dinosr
View on GitHub
DinoSR: Self-Distillation and Online Clustering for Self-supervised Speech Representation Learning
☆53Jan 18, 2024Updated 2 years ago
e-c-k-e-r / vall-e
View on GitHub
An unofficial PyTorch implementation of VALL-E
☆88Aug 3, 2025Updated 11 months ago
ZhangXInFD / soundstorm-speechtokenizer
View on GitHub
Implementation of SoundStorm built upon SpeechTokenizer.
☆116Nov 2, 2023Updated 2 years ago
innnky / descript-audio-vae
View on GitHub
VAE modified from Descript Audio Codec, which replaces the RVQ with VAE
☆92Apr 2, 2024Updated 2 years ago
nervjack2 / MelHuBERT
View on GitHub
Official implementation of MelHuBERT
☆70Feb 21, 2026Updated 5 months ago
reppy4620 / convnext_tts
View on GitHub
Unofficial implementation of ConvNeXt-TTS powered by lightning
☆18Oct 20, 2024Updated last year
sungnyun / ARMHuBERT
View on GitHub
(Interspeech 2023 & ICASSP 2024) Official repository for ARMHuBERT and STaRHuBERT
☆41Aug 29, 2024Updated last year
End-to-end encrypted cloud storage - Proton Drive • Ad
Special offer: 40% Off Yearly / 80% Off First Month. Protect your most important files, photos, and documents from prying eyes.
yl4579 / StyleTTS-VC
View on GitHub
Official Implementation of StyleTTS-VC
☆200Jan 14, 2025Updated last year
MingjieChen / EasyVC
View on GitHub
A toolkit for any-to-any encoder-decoder voice conversion systems
☆83Aug 10, 2023Updated 2 years ago
ex3ndr / supervoice-gpt
View on GitHub
GPT-style network for phonemization with durations of text
☆68Mar 21, 2024Updated 2 years ago
X-E-Speech / X-E-Speech-code
View on GitHub
X-E-Speech: Joint Training Framework of Non-Autoregressive Cross-lingual Emotional Text-to-Speech and Voice Conversion
☆112Apr 1, 2024Updated 2 years ago
mechanicalsea / lighthubert
View on GitHub
LightHuBERT: Lightweight and Configurable Speech Representation Learning with Once-for-All Hidden-Unit BERT
☆73Sep 26, 2022Updated 3 years ago
the-bird-F / Expressive-Vectors
View on GitHub
[ICASSP 2026] Task Vector in TTS: Toward Emotionally Expressive Dialectal Speech Synthesis
☆40Dec 24, 2025Updated 7 months ago
shang0712 / HierTTS
View on GitHub
☆47Apr 16, 2023Updated 3 years ago
chorowski-lab / hCPC
View on GitHub
Implementation of multi-level Contrastive Predictive Coding (CPC) methods
☆20Jan 12, 2023Updated 3 years ago
sony / bigvsan
View on GitHub
Pytorch implementation of BigVSAN
☆203Dec 9, 2025Updated 7 months ago
Deploy to Railway using AI coding agents - Free Credits Offer • Ad
Use Claude Code, Codex, OpenCode, and more. Autonomous software development now has the infrastructure to match with Railway.
seastar105 / pflow-encodec
View on GitHub
Implementation of TTS model based on NVIDIA P-Flow TTS Paper
☆77Jul 13, 2026Updated last week
ashi-ta / speechGLUE
View on GitHub
SpeechGLUE is a speech version of the GLUE benchmark, driven by text-to-speech.
☆13Jun 2, 2023Updated 3 years ago
k2-fsa / libriheavy
View on GitHub
Libriheavy: a 50,000 hours ASR corpus with punctuation casing and context
☆220Sep 10, 2024Updated last year
ZhikangNiu / Semantic-VAE
View on GitHub
[INTERSPEECH 2026 Oral]Official code for "Semantic-VAE: Semantic-Alignment Latent Representation for Better Speech Synthesis"
☆120Jun 21, 2026Updated last month
AI-S2-Lab / GPT-Talker
View on GitHub
[ACMMM'2024] Generative Expressive Conversational Speech Synthesis
☆45Oct 28, 2024Updated last year
adelacvg / detail_tts
View on GitHub
All generative model in one for better TTS model
☆74Sep 8, 2024Updated last year
AbrahamSanders / codec-bpe
View on GitHub
Implementation of Acoustic BPE (Shen et al., 2024), extended for RVQ-based Neural Audio Codecs
☆76Dec 3, 2025Updated 7 months ago