apple/pytorch-speech-features

Readme badge preview -

If you own this repo, copy the snippet below and add it to your README.md

[![RelatedRepos](https://img.shields.io/badge/related-repos-yellow)](https://relatedrepos.com/gh/apple/pytorch-speech-features)

apple / pytorch-speech-features

☆87

Alternatives and similar repositories for pytorch-speech-features

Users that are interested in pytorch-speech-features are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.

Sorting:

LAION-AI / Text-to-speech
View on GitHub
☆61Nov 4, 2023Updated 2 years ago
huutuongtu / Lightvoc
View on GitHub
LIGHTVOC AN UPSAMPLING-FREE GAN VOCODER BASED ON CONFORMER AND INVERSE SHORT-TIME FOURIER TRANSFORM
☆18May 17, 2024Updated 2 years ago
yangdongchao / SimpleSpeech
View on GitHub
The open source code for SimpleSpeech series
☆147Oct 8, 2024Updated last year
ex3ndr / supervoice-librilight-preprocessed
View on GitHub
60k hours of phoneme-aligned audio from audio books
☆19Jul 27, 2024Updated last year
Alexander-H-Liu / dinosr
View on GitHub
DinoSR: Self-Distillation and Online Clustering for Self-supervised Speech Representation Learning
☆53Jan 18, 2024Updated 2 years ago
GPU virtual machines on DigitalOcean Gradient AI • Ad
Get to production fast with high-performance AMD and NVIDIA GPUs you can spin up in seconds. The definition of operational simplicity.
apple / ml-ogen
View on GitHub
☆13Apr 7, 2024Updated 2 years ago
lucadellalib / discrete-wavlm-codec
View on GitHub
A neural speech codec based on discrete WavLM representations
☆26Aug 28, 2024Updated last year
lifeiteng / TTS-TextAnalyzer
View on GitHub
TTS Text Analyzer
☆31Jul 20, 2023Updated 3 years ago
audiodemo / voice-conversion
View on GitHub
Vocoder-Free Non-Parallel Conversion of Whispered Speech With Masked Cycle-Consistent Generative Adversarial Networks
☆17Aug 18, 2023Updated 2 years ago
utter-project / mHuBERT-147-scripts
View on GitHub
Collection of scripts from mHuBERT-147.
☆35Nov 19, 2024Updated last year
reppy4620 / convnext_tts
View on GitHub
Unofficial implementation of ConvNeXt-TTS powered by lightning
☆18Oct 20, 2024Updated last year
ex3ndr / supervoice-gpt
View on GitHub
GPT-style network for phonemization with durations of text
☆68Mar 21, 2024Updated 2 years ago
MiscellaneousStuff / PhoneLM
View on GitHub
(R&D) Text to speech using phonemes as inputs and audio codec codes as outputs. Loosely based on MegaByte, VALL-E and Encodec.
☆48Sep 4, 2023Updated 2 years ago
apple / ml-reed
View on GitHub
☆13Feb 5, 2024Updated 2 years ago
GPU virtual machines on DigitalOcean Gradient AI • Ad
Get to production fast with high-performance AMD and NVIDIA GPUs you can spin up in seconds. The definition of operational simplicity.
Aria-K-Alethia / laughter-synthesis
View on GitHub
Official implementation of the paper "Laughter Synthesis using Pseudo Phonetic Tokens with a Large-scale In-the-wild Laughter Corpus" acc…
☆77Jul 16, 2023Updated 3 years ago
pyf98 / DPHuBERT
View on GitHub
INTERSPEECH 2023: "DPHuBERT: Joint Distillation and Pruning of Self-Supervised Speech Models"
☆118Jan 26, 2024Updated 2 years ago
apple / ml-vfm-kt
View on GitHub
☆14Jul 2, 2024Updated 2 years ago
tabahi / contexless-phonemes-CUPE
View on GitHub
pytorch model for contexless-phoneme prediction from speech audio
☆32Oct 30, 2025Updated 8 months ago
AI-Unicamp / TTS-Objective-Metrics
View on GitHub
Objective metrics used in several text-to-speech (TTS) papers.
☆54Jun 17, 2025Updated last year
IDEA-Emdoor-Lab / DistilCodec
View on GitHub
A Neural Audio Codec (NAC) for Universal Audio
☆47May 30, 2025Updated last year
lifeiteng / naturalspeech3_facodec
View on GitHub
FACodec: Speech Codec with Attribute Factorization used for NaturalSpeech 3
☆254Apr 20, 2024Updated 2 years ago
maxrmorrison / torbi
View on GitHub
Viterbi decoding in PyTorch
☆42May 5, 2026Updated 2 months ago
ex3ndr / supervoice-enhance
View on GitHub
Supervoice diffusion enhance
☆28Jul 15, 2024Updated 2 years ago
End-to-end encrypted cloud storage - Proton Drive • Ad
Special offer: 40% Off Yearly / 80% Off First Month. Protect your most important files, photos, and documents from prying eyes.
meaningTeam / tidy-tunes
View on GitHub
Tidy Tunes is an easy-to-use pipeline for mining high-quality audio data for speech generation models. To do so, it chains multiple open …
☆23May 19, 2026Updated 2 months ago
AbrahamSanders / codec-bpe
View on GitHub
Implementation of Acoustic BPE (Shen et al., 2024), extended for RVQ-based Neural Audio Codecs
☆76Dec 3, 2025Updated 7 months ago
bfs18 / e2_tts
View on GitHub
☆69Sep 3, 2024Updated last year
hayeong0 / Diff-HierVC
View on GitHub
Official Pytorch Implementation of "Diff-HierVC: Diffusion-based Hierarchical Voice Conversion with Robust Pitch Generation and Masked Pr…
☆237Jul 3, 2024Updated 2 years ago
DigitalPhonetics / VoicePAT
View on GitHub
VoicePAT is a modular and efficient toolkit for voice privacy research, with main focus on speaker anonymization.
☆59May 14, 2024Updated 2 years ago
Andong-Li-speech / BridgeVoC
View on GitHub
This is the repository for the work "BridgeVoC: Revitalizing Neural Vocoder from a Restoration Perspective".
☆67Nov 5, 2025Updated 8 months ago
xinjli / alqalign
View on GitHub
multilingual speech aligner
☆78Nov 19, 2023Updated 2 years ago
Respaired / RiFornet_Vocoder
View on GitHub
a Neural Vocoder supporting Ring Attention, Conformer and NSF.
☆25Aug 1, 2025Updated 11 months ago
mct10 / RepCodec
View on GitHub
Models and code for RepCodec: A Speech Representation Codec for Speech Tokenization
☆196Jul 12, 2024Updated 2 years ago
Managed Database hosting by DigitalOcean • Ad
PostgreSQL, MySQL, MongoDB, Kafka, Valkey, and OpenSearch available. Automatically scale up storage and focus on building your apps.
zjwang21 / mix-phoneme-bert
View on GitHub
An unofficial PyTorch implementation of Mix-Phoneme-Bert
☆40Jul 10, 2023Updated 3 years ago
X-LANCE / VoiceFlow-TTS
View on GitHub
[ICASSP 2024] This is the official code for "VoiceFlow: Efficient Text-to-Speech with Rectified Flow Matching"
☆376Sep 3, 2024Updated last year
sony / bigvsan
View on GitHub
Pytorch implementation of BigVSAN
☆203Dec 9, 2025Updated 7 months ago
Mddct / transformer-vocos
View on GitHub
☆35Sep 6, 2025Updated 10 months ago
lmaxwell / McHuo
View on GitHub
A chinese singing voice dataset, professional male singer, 105 songs, 132 minutes
☆12Oct 19, 2023Updated 2 years ago
nii-yamagishilab / SpeechSPC-mini
View on GitHub
Speech Security and Privacy Compendium - Mini
☆10Jun 18, 2024Updated 2 years ago
apple / ml-lucid-datagen
View on GitHub
☆31Mar 4, 2024Updated 2 years ago