☆24 · Jan 6, 2023 · Updated 3 years ago
Alternatives and similar repositories for voicebox
Users that are interested in voicebox are comparing it to the libraries listed below.
- ☆14 · Apr 18, 2023 · Updated 3 years ago
- ☆11 · Mar 28, 2021 · Updated 5 years ago
- ☆20 · Sep 20, 2024 · Updated last year
- Source code and speech samples for the DSU-AVO paper accepted to INTERSPEECH 2023 ☆12 · May 13, 2024 · Updated 2 years ago
- ☆23 · Apr 3, 2025 · Updated last year
- Train no-reference speech quality estimators with multiple datasets via learned, per-dataset alignments. ☆18 · Aug 1, 2025 · Updated 9 months ago
- Audio-Visual Room Impulse Response Estimation ☆23 · Jul 22, 2024 · Updated last year
- Prepend universal audio attack segment to mute Whisper ☆39 · Jan 22, 2025 · Updated last year
- This is the official implementation of reverberant speech to room impulse response estimator ☆42 · Aug 7, 2024 · Updated last year
- ☆18 · Mar 13, 2024 · Updated 2 years ago
- Dataset/code for AudioMarkBench: Benchmarking Robustness of Audio Watermarking ☆47 · Aug 23, 2024 · Updated last year
- A Python library for computing the Mel-Cepstral Distance (Mel-Cepstral Distortion, MCD) between two inputs. This implementation is based … ☆67 · Aug 24, 2025 · Updated 9 months ago
- Just another FastSpeech 2 but cleaner code :) ☆29 · Jun 28, 2024 · Updated last year
- TMT: Tri-Modal Translation between Speech, Image, and Text by Processing Different Modalities as Different Languages ☆19 · May 23, 2024 · Updated 2 years ago
- Uses the fastrtc framework to call qwen-2.5-omni-realtime for real-time voice, video, and more ☆14 · Jun 27, 2025 · Updated 11 months ago
- Protect your eyes to see the world! ☆11 · Oct 16, 2021 · Updated 4 years ago
- ☆33 · Jun 29, 2023 · Updated 2 years ago
- ☆12 · Feb 26, 2023 · Updated 3 years ago
- A Python training and inference implementation of Yolov5 reflective clothes and helmet detection ☆20 · Dec 2, 2021 · Updated 4 years ago
- ☆15 · Feb 24, 2023 · Updated 3 years ago
- Code repository of the paper "Alleviating Adversarial Attacks on Variational Autoencoders with MCMC" published at NeurIPS 2022. https://a… ☆10 · Dec 14, 2022 · Updated 3 years ago
- ☆14 · Feb 1, 2021 · Updated 5 years ago
- Source code of paper <End-to-End Language Diarization for Bilingual Code-switching Speech> ☆19 · Jan 23, 2022 · Updated 4 years ago
- Code for the AAAI 2023 paper: "Global-Local Characteristic Excited Cross-Modal Attacks from Images to Video" (accepted). ☆14 · Feb 25, 2024 · Updated 2 years ago
- Psychoacoustic Calibration for Efficient Neural Audio Coding ☆26 · Sep 26, 2023 · Updated 2 years ago
- ☆11 · Apr 12, 2024 · Updated 2 years ago
- Syllable Segmentation and Cross-Lingual Generalization in a Visually Grounded, Self-Supervised Speech Model ☆35 · Aug 27, 2023 · Updated 2 years ago
- Lightweight Speech Representation Learning for One-Shot Voice Conversion ☆23 · Dec 12, 2024 · Updated last year
- ☆12 · Feb 23, 2023 · Updated 3 years ago
- The official codebase for "Experiential Reinforcement Learning" - https://arxiv.org/pdf/2602.13949v1 ☆68 · May 8, 2026 · Updated 3 weeks ago
- ☆55 · Mar 2, 2023 · Updated 3 years ago
- Implementation for paper "Disentangled Speech Representation Learning for One-Shot Cross-Lingual Voice Conversion Using ß-VAE" ☆44 · Apr 10, 2023 · Updated 3 years ago
- Target Agnostic Attack on Deep Models: Exploiting Security Vulnerabilities of Transfer Learning ☆10 · Jul 2, 2019 · Updated 6 years ago
- Code for InterSpeech 2024 Paper: LipGER: Visually-Conditioned Generative Error Correction for Robust Automatic Speech Recognition ☆19 · Jul 16, 2024 · Updated last year
- Official PyTorch implementation of 'VINP: Variational Bayesian Inference with Neural Speech Prior for Joint ASR-Effective Speech Dereverb… ☆33 · Feb 23, 2026 · Updated 3 months ago
- A Chinese large language model with a Western "translationese" chat style, fine-tuned entirely on ChatGPT-generated data ☆21 · Apr 1, 2024 · Updated 2 years ago
- Proof of concept code for DeepSteal (SP'22) Machine Learning model extraction (weight stealing) with memory side channel ☆15 · Jun 22, 2023 · Updated 2 years ago
- Voice conversion model for real-time speech synthesis using PPG (Phonetic PosteriorGram) as an intermediate feature, written in Pytorch. ☆29 · Mar 3, 2022 · Updated 4 years ago
- The official implementation of the paper "Defending Your Voice: Adversarial Attack on Voice Conversion". ☆53 · May 15, 2024 · Updated 2 years ago