Madhuvod / VoxLingua
A model (maybe an app) that translates the audio of a video from one language to another, cloning the voice of the original video for the translated audio
☆11Updated last month
Alternatives and similar repositories for VoxLingua
Users that are interested in VoxLingua are comparing it to the libraries listed below
- MultiMed-ST: Large-scale Many-to-many Multilingual Medical Speech Translation☆13Updated 3 months ago
- ☆11Updated 2 years ago
- MTDA-HSED: Mutual-Assistance Tuning and Dual-Branch Aggregating for Heterogeneous Sound Event Detection☆9Updated 9 months ago
- Onset-and-Offset-Aware Sound Event Detection☆17Updated 5 months ago
- A semi-supervised sequence-to-sequence ASR☆10Updated 2 years ago
- Codebase for "Transcription free filler word detection with Neural semi-CRFs" [ICASSP2023]☆8Updated last year
- Voice activity detection and speaker gender segmentation audiovisual corpus☆15Updated 5 months ago
- Vocoder-Free Non-Parallel Conversion of Whispered Speech With Masked Cycle-Consistent Generative Adversarial Networks☆17Updated last year
- ☆11Updated 4 months ago
- Supervoice diffusion enhance☆27Updated last year
- Code and Resources for "LLM-Powered Grapheme-to-Phoneme Conversion: Benchmark and Case Study", introducing methods to leverage LLMs for G…☆11Updated last month
- Implementation of the paper "Variable Bitrate Residual Vector Quantization for Audio Coding"☆11Updated 3 months ago
- A Benchmark Corpus for Low-Resource Cantonese Punctuation Restoration from Speech Transcripts☆14Updated 7 months ago
- ☆14Updated last year
- ☆16Updated 3 months ago
- A composition of offline tools to achieve high quality multilingual speech to text transcription☆19Updated last month
- This is not remotely close to a finished product, and does not intend to nor does this claim to be working fine-tuning code for MaskGCT. …☆12Updated 7 months ago
- Sing any popular song with your voice☆11Updated 3 years ago
- Official code for Dense-TSNet☆12Updated 9 months ago
- text to speech☆10Updated last year
- ☆9Updated 5 years ago
- From the paper "Paraformer-v2: An improved non-autoregressive transformer for noise-robust speech recognition"☆9Updated 7 months ago
- Attention-Enhanced Short-Time Wiener Solution for Acoustic Echo Cancellation☆16Updated last week
- Code for ICML25 Paper "Overcoming Non-monotonicity in Transducer-based Streaming Generation"☆11Updated last month
- Unsupervised Voice Activity Detection by Modeling Source and System Information using Zero Frequency Filtering☆21Updated last year
- [INTERSPEECH 2024] Official pytorch code for the paper "Disentangled Representation Learning for Environment-agnostic Speaker Recognition…☆14Updated 11 months ago
- ☆10Updated 8 months ago
- ☆11Updated last year
- An evaluation set for large-scale trained TTS models (Coming in Sep 2024)☆12Updated 10 months ago
- ☆13Updated 8 months ago