nethermanpro/transvip

☆164

Alternatives and similar repositories for transvip

Users who are interested in transvip are comparing it to the libraries listed below.

openaudiolab / LLaST
LLaST: Improved End-to-end Speech Translation System Leveraged by Large Language Models
☆26 · Aug 11, 2024 · Updated last year
fyvo / WMT-Biomed-Test
☆13 · Aug 23, 2024 · Updated last year
huangruizhe / audio
Data manipulation and transformation for audio signal processing, powered by PyTorch
☆10 · Sep 30, 2024 · Updated last year
sarapapi / hearing2translate
A unified evaluation suite for speech-to-text translation, covering SpeechLLMs, SFMs, and cascaded systems across diverse real-world spee…
☆32 · Apr 25, 2026 · Updated 2 months ago
3loi / NaturalVoices
☆61 · Oct 22, 2025 · Updated 8 months ago
lmxue / ICASSP2022_TTS_VC_Summary
ICASSP2022 TTS&VC Summary
☆13 · Jun 9, 2022 · Updated 4 years ago
wenet-e2e / wesr
We Speech Transcript based on LLM, in 300 lines of code.
☆182 · Jun 20, 2025 · Updated last year
Mddct / simple-tts
(WIP) Long-form speech generation
☆30 · Apr 2, 2025 · Updated last year
uthree / ddsp-vocoder
☆12 · Nov 7, 2024 · Updated last year
iamanigeeit / present
☆14 · Aug 19, 2024 · Updated last year
DataoceanAI / Dolphin
Dolphin is a multilingual, multitask ASR model jointly trained by DataoceanAI and Tsinghua University.
☆772 · Jun 11, 2026 · Updated last month
baichuan-inc / Baichuan-Audio
Baichuan-Audio: A Unified Framework for End-to-End Speech Interaction
☆223 · Feb 28, 2025 · Updated last year
X-E-Speech / X-E-Speech-code
X-E-Speech: Joint Training Framework of Non-Autoregressive Cross-lingual Emotional Text-to-Speech and Voice Conversion
☆112 · Apr 1, 2024 · Updated 2 years ago
kadirnar / fast-dacvae
☆20 · Mar 17, 2026 · Updated 4 months ago
walker-hyf / ECSS
Emotion Rendering for Conversational Speech Synthesis with Heterogeneous Graph-Based Context Modeling (Accepted by AAAI'2024)
☆59 · Jun 20, 2024 · Updated 2 years ago
liuhuang31 / g2pw_once
G2pw's inference speed is accelerated by about 8-10 times. Change loop generated predictive data to only once and model loop prediction b…
☆14 · Dec 30, 2023 · Updated 2 years ago
lugan113 / SynTTS-Commands-Official
SynTTS-Commands is a large-scale, multilingual (English & Chinese) synthetic speech command dataset designed for low-power Keyword Spotti…
☆17 · Feb 5, 2026 · Updated 5 months ago
chentuochao / Spatial-Speech-Translation
The official repo for paper "Spatial Speech Translation: Translating Across Space With Binaural Hearables"
☆74 · Aug 15, 2025 · Updated 11 months ago
mbzuai-nlp / sttatts
☆31 · Oct 29, 2024 · Updated last year
maitrix-org / Voila
☆496 · May 6, 2025 · Updated last year
walker-hyf / GPT-Talker
Generative Expressive Conversational Speech Synthesis (Accepted by MM'2024)
☆78 · Nov 1, 2024 · Updated last year
zy-du / Disentanglement-of-Emotional-Style-and-Speaker-Identity-for-Expressive-Voice-Conversion
This is the implementation of our Interspeech 2022 paper "Disentanglement of Emotional Style and Speaker Identity for Expressive Voice Conv…
☆21 · Sep 18, 2023 · Updated 2 years ago
adelacvg / NS2VC
Unofficial implementation of NaturalSpeech2 for Voice Conversion and Text to Speech
☆236 · Feb 29, 2024 · Updated 2 years ago
luotianze666 / WaveFM
[NAACL 2025] WaveFM: A High-Fidelity and Efficient Vocoder Based on Flow Matching
☆133 · Apr 8, 2026 · Updated 3 months ago
yunyikristy / skipNet
☆12 · Oct 21, 2019 · Updated 6 years ago
sarulab-speech / UTMOSv2
UTokyo-SaruLab MOS Prediction System
☆350 · Apr 2, 2026 · Updated 3 months ago
arnabdas8901 / StarGAN-VC_PlusPlus
☆11 · Aug 11, 2023 · Updated 2 years ago
ScottishFold007 / TTSAudioNormalizer
TTSAudioNormalizer is a specialized tool for TTS data production, featuring descriptive statistical analysis of audio loudness and loud…
☆112 · Dec 20, 2024 · Updated last year
skit-ai / woc-tts-enhancement
This is a Winter of Code project aimed at speech enhancement for text-to-speech models.
☆25 · Feb 6, 2022 · Updated 4 years ago
pengzhendong / torchfa
Torch Audio Forced Aligner for Mixed Chinese (Mandarin or Cantonese) and English.
☆61 · Sep 5, 2025 · Updated 10 months ago
ogunlao / glowtts_stdp
Glow-TTS with Stochastic Duration Predictor and Stochastic Pitch Predictor
☆19 · Jun 5, 2023 · Updated 3 years ago
lewangdev / CosyVoice
Multi-lingual large voice generation model, providing inference, training and deployment full-stack ability.
☆13 · Jul 15, 2024 · Updated 2 years ago
VITA-MLLM / VITA
✨✨[NeurIPS 2025] VITA-1.5: Towards GPT-4o Level Real-Time Vision and Speech Interaction
☆2,520 · Mar 28, 2025 · Updated last year
ictnlp / StreamSpeech
StreamSpeech is an “All in One” seamless model for offline and simultaneous speech recognition, speech translation and speech synthesis.
☆1,277 · Jun 29, 2025 · Updated last year
zhu-han / SpeechLLM
LLM-based ASR recipe with Zipformer encoder and Qwen LLM
☆34 · Sep 25, 2025 · Updated 9 months ago
ffxiong / uaspeech
Baseline Kaldi script for the UA-SPEECH corpus
☆32 · Oct 16, 2024 · Updated last year
pengzhendong / streaming-asr
One command to start a streaming ASR server.
☆12 · Oct 2, 2024 · Updated last year
Plachtaa / FAcodec
Training code for FAcodec presented in NaturalSpeech3
☆244 · Aug 26, 2024 · Updated last year
WangHelin1997 / Automatic_Speech_Annotator
Automatic speech annotator processing speech with voice activity detection, overlapping speech detection, speaker diarization and automat…
☆33 · Jun 14, 2024 · Updated 2 years ago