vlarine/wav2vec

vlarine / wav2vec

vq-wav2vec inference

☆15

Alternatives and similar repositories for wav2vec

Users who are interested in wav2vec are comparing it to the repositories listed below.

RicherMans / SpokenLanguageClassifiers
Pretrained spoken language classifiers from audio.
☆10 · Jan 21, 2021 · Updated 5 years ago
azraelkuan / repgan
RepVgg + HiFiGAN
☆36 · Aug 10, 2022 · Updated 3 years ago
lifeiteng / TTS-TextAnalyzer
TTS Text Analyzer
☆31 · Jul 20, 2023 · Updated 3 years ago
Yangyangii / TPGST-Tacotron
Google's TPGST reimplementation.
☆34 · Dec 11, 2019 · Updated 6 years ago
cpdu / unicats
☆63 · Jan 15, 2024 · Updated 2 years ago
LAION-AI / emotional-speech-annotations
This repository contains prompts and best practices for annotating audio clips with a very high degree of detail using Audio-Language-Models
☆35 · Oct 13, 2024 · Updated last year
XinyuZhou2000 / Spoken-Dialogue
☆18 · Dec 7, 2023 · Updated 2 years ago
X-LANCE / UniCATS-CTX-txt2vec
[AAAI 2024] CTX-txt2vec, the acoustic model in UniCATS
☆64 · Nov 18, 2024 · Updated last year
AbrahamSanders / codec-bpe
Implementation of Acoustic BPE (Shen et al., 2024), extended for RVQ-based Neural Audio Codecs
☆76 · Dec 3, 2025 · Updated 7 months ago
X-LANCE / public_talks
Materials of public talks given by SJTU X-LANCE members
☆14 · Dec 3, 2022 · Updated 3 years ago
espnet / icassp2020-tts
ESPnet-TTS Audio Sample HP
☆21 · Oct 25, 2019 · Updated 6 years ago
cpdu / vallt
☆36 · Mar 14, 2025 · Updated last year
ishine / PnG-BERT
PnG BERT: Augmented BERT on Phonemes and Graphemes for Neural TTS
☆24 · Jan 29, 2022 · Updated 4 years ago
y-ren16 / OV-InstructTTS
☆22 · Jan 27, 2026 · Updated 5 months ago
HawkAaron / MiniDFS
Python Implementation of Mini DFS
☆15 · Jun 24, 2018 · Updated 8 years ago
Jackson-Kang / VQVC-Pytorch
An unofficial implementation of Vector Quantization Voice Conversion (VQVC).
☆29 · Apr 12, 2021 · Updated 5 years ago
zzw922cn / wesinger2
Synthesized singing voice demos of the WeSinger 2 paper.
☆26 · Feb 20, 2023 · Updated 3 years ago
lifeiteng / VoiceBox
Voicebox: Text-Guided Multilingual Universal Speech Generation at Scale
☆29 · Aug 4, 2023 · Updated 2 years ago
YoungSeng / ReprGesture
The ReprGesture entry to the GENEA Challenge 2022 (ICMI 2022)
☆16 · Nov 8, 2022 · Updated 3 years ago
yesheng-THU / GFGE
GFGE
☆15 · Sep 7, 2022 · Updated 3 years ago
BridgetteSong / ExpressiveTacotron
This repository provides a multi-mode and multi-speaker expressive speech synthesis framework, including multi-attentive Tacotron, DurIAN…
☆74 · Sep 21, 2022 · Updated 3 years ago
hvoss-tech / AQGT
Official Implementation of AQ-GT: a Temporally Aligned and Quantized GRU-Transformer for Co-Speech Gesture Synthesis with the extension (…
☆21 · Apr 19, 2024 · Updated 2 years ago
gli-27 / voca-pytorch
☆10 · Jan 5, 2020 · Updated 6 years ago
renqianluo / GBDT-NAS
GBDT-NAS
☆28 · Oct 1, 2021 · Updated 4 years ago
zzw922cn / LPC_for_TTS
Linear Prediction Coefficients estimation from mel-spectrogram implemented in Python based on Levinson-Durbin algorithm.
☆72 · Mar 19, 2021 · Updated 5 years ago
lifeiteng / NaturalSpeech2
☆33 · Jun 29, 2023 · Updated 3 years ago
zzw922cn / TF2_soft_dtw
Custom TensorFlow2 implementations of forward and backward computation of soft-DTW algorithm in batch mode.
☆20 · Jun 7, 2021 · Updated 5 years ago
bshall / urhythmic
Unsupervised Rhythm Modeling for Voice Conversion
☆85 · Aug 3, 2023 · Updated 2 years ago
adelacvg / diff-vits
☆39 · Oct 1, 2023 · Updated 2 years ago
rishikksh20 / iSTFTNet-pytorch
iSTFTNet: Fast and Lightweight Mel-spectrogram Vocoder Incorporating Inverse Short-time Fourier Transform
☆277 · Jul 15, 2025 · Updated last year
rishikksh20 / MiniMax-TTS-pytorch
Try to replicate the architecture of MiniMaxTTS mentioned in its technical report
☆47 · Sep 2, 2025 · Updated 10 months ago
SeventeenChen / Python_Speech_SZY
Python version of Song Zhiyong's book "Applications of MATLAB in Speech Signal Analysis and Synthesis"
☆37 · Jan 5, 2022 · Updated 4 years ago
sigal-raab / Motion
Motion classes, based on Holden's code http://theorangeduck.com/page/deep-learning-framework-character-motion-synthesis-and-editing
☆26 · Apr 23, 2026 · Updated 2 months ago
npuichigo / tarzan
High-level API for tar-based dataset
☆12 · Feb 3, 2024 · Updated 2 years ago
IsraelCohenLab / ConstantBeamwidthUCCA
☆11 · Jun 6, 2022 · Updated 4 years ago
0nutation / USLM
Unified Speech Language Model for paper "SpeechTokenizer: Unified Speech Tokenizer for Speech Large Language Models" (ICLR 2024)
☆152 · Sep 14, 2023 · Updated 2 years ago
jinhan / tacotron2-gst
Tacotron2 with Global Style Tokens
☆64 · Apr 19, 2019 · Updated 7 years ago
Tedgoo / Partial-Discharge-SVM
Classification learning was performed using the SVM model according to the type of partial discharge signal. (The data accumulated for 1 …
☆10 · Jan 16, 2022 · Updated 4 years ago
ga642381 / SpeechGen
《SpeechGen: Unlocking the Generative Power of Speech Language Models with Prompts》
☆77 · Jun 9, 2023 · Updated 3 years ago