Madhuvod / VoxLingua
A model (maybe an app) that translates the audio of a video from one language to another, cloning the voice from the original video for the translated audio
☆14 Updated 4 months ago
Alternatives and similar repositories for VoxLingua
Users that are interested in VoxLingua are comparing it to the libraries listed below
- CLASP: Contrastive Language-Speech Pretraining for Multilingual Multimodal Information Retrieval ☆13 Updated 2 months ago
- This repository includes training, inference, evaluation, and utility scripts developed for fine-tuning the Whisper medium.en model on Ai… ☆18 Updated 11 months ago
- specifications and documentation for the Open Voice Interoperability Initiative Project ☆19 Updated 3 weeks ago
- Code associated with the paper: CTC-DRO: Robust Optimization for Reducing Language Disparities in Speech Recognition. ☆15 Updated 4 months ago
- Text-to-Speech Latency Benchmark ☆18 Updated 3 months ago
- Transfer learning approach to pronunciation scoring ☆10 Updated last year
- proof of concept conversation orchestrator with a speech-language model ☆20 Updated 11 months ago
- Code and Resources for "LLM-Powered Grapheme-to-Phoneme Conversion: Benchmark and Case Study", introducing methods to leverage LLMs for G… ☆12 Updated 4 months ago
- Simple audio AE ☆12 Updated 10 months ago
- Supervoice diffusion enhance ☆27 Updated last year
- Official implementation of the paper "Distilling a Pretrained Language Model to a Multilingual ASR Model" (Interspeech 2022) ☆12 Updated last year
- Voice activity detection and speaker gender segmentation audiovisual corpus ☆16 Updated 8 months ago
- a simple system for 2-way interruptible voice interactions between human and LLM ☆30 Updated last year
- C++ version of pyannote audio overlapped speech detection pipeline ☆13 Updated last year
- A composition of offline tools to achieve high quality multilingual speech to text transcription ☆19 Updated 3 weeks ago
- Code release for "TinySpeech: Attention Condensers for Deep Speech Recognition Neural Networks on Edge Devices" ☆20 Updated 3 months ago
- Descript Audio Codec - VAE Variant (.dac-vae): High-Fidelity Audio Compression with Variational Autoencoder ☆17 Updated 3 weeks ago
- Dippy Synthetic Speech Subnet ☆17 Updated 2 weeks ago
- Vocoder-Free Non-Parallel Conversion of Whispered Speech With Masked Cycle-Consistent Generative Adversarial Networks ☆17 Updated 2 years ago
- ☆22 Updated 2 months ago
- a Neural Vocoder supporting Ring Attention, Conformer and NSF. ☆21 Updated last month
- [INTERSPEECH 2024] Official pytorch code for the paper "Disentangled Representation Learning for Environment-agnostic Speaker Recognition… ☆14 Updated last year
- A lightweight audio codec based on a single quantizer ☆24 Updated 3 weeks ago
- Open TTS models, built for streaming on the edge ☆43 Updated 6 months ago
- Official PyTorch implementation of "AutoStyle-TTS: Retrieval-Augmented Generation based Automatic Style Matching Text-to-Speech Synthesis… ☆15 Updated 6 months ago
- Legible, Scalable, Reproducible Foundation Models with Named Tensors and Jax ☆15 Updated last year
- ☆11 Updated 10 months ago
- Forced alignment decoder for Whisper. ☆14 Updated last year
- ☆17 Updated 6 months ago
- Sing any popular song with your voice ☆11 Updated 3 years ago