Data manipulation and transformation for audio signal processing, powered by PyTorch
☆10Sep 30, 2024Updated last year
Alternatives and similar repositories for audio
Users that are interested in audio are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.
Sorting:
- Train no-reference speech quality estimators with multiple datasets via learned, per-dataset alignments.☆18Aug 1, 2025Updated 7 months ago
- ☆16Nov 9, 2023Updated 2 years ago
- faster inference☆28Jan 20, 2025Updated last year
- ☆36Sep 6, 2025Updated 6 months ago
- semantic tokenizer for speech and music☆21Jul 6, 2025Updated 8 months ago
- Wordpress hosting with auto-scaling on Cloudways • AdFully Managed hosting built for WordPress-powered businesses that need reliable, auto-scalable hosting. Cloudways SafeUpdates now available.
- [ACL 2025] OZSpeech: One-step Zero-shot Speech Synthesis with Learned-Prior-Conditioned Flow Matching☆45Feb 9, 2025Updated last year
- One command to start a streaming ASR server.☆12Oct 2, 2024Updated last year
- [EMNLP 2025 Findings] Official code for EZ-VC: Easy Zero-shot Any-to-Any Voice Conversion☆36Sep 9, 2025Updated 6 months ago
- ☆25Jun 19, 2025Updated 9 months ago
- (WIP)long form speech generatoins☆31Apr 2, 2025Updated 11 months ago
- One command to build TLG.fst for WeNet.☆30Oct 11, 2022Updated 3 years ago
- Code associated with the paper: CTC-DRO: Robust Optimization for Reducing Language Disparities in Speech Recognition.☆17May 16, 2025Updated 10 months ago
- Hpyformer base FunASR☆30Nov 5, 2024Updated last year
- PyTorch implementation for HyperMixing, a linear-time token-mixing technique used in HyperMixer architecture☆26Jun 12, 2023Updated 2 years ago
- GPU virtual machines on DigitalOcean Gradient AI • AdGet to production fast with high-performance AMD and NVIDIA GPUs you can spin up in seconds. The definition of operational simplicity.
- ☆54Jul 1, 2024Updated last year
- Repo of the paper "Towards Building an End-to-End Multilingual Automatic Lyrics Transcription Model""☆15Jun 28, 2024Updated last year
- CLASP: Contrastive Language-Speech Pretraining for Multilingual Multimodal Information Retrieval☆13Jun 27, 2025Updated 9 months ago
- 基于 Sherpa-ONNX 实现在线下载模型的端侧实时语音识别应用(Implement speech recognition based on Sherpa-ONNX by downloading the model online.)☆28Feb 27, 2025Updated last year
- Wenet speech to text for react native☆10Nov 1, 2022Updated 3 years ago
- Training code for kokoro tts model☆37Nov 15, 2025Updated 4 months ago
- Unsupervised word segmentation and clustering of speech☆13Feb 17, 2017Updated 9 years ago
- A neural speech codec based on discrete WavLM representations☆26Aug 28, 2024Updated last year
- ☆40Jul 15, 2025Updated 8 months ago
- Managed Kubernetes at scale on DigitalOcean • AdDigitalOcean Kubernetes includes the control plane, bandwidth allowance, container registry, automatic updates, and more for free.
- SpeechPlus: Small LLM-Based Text-to-Speech Library 🚀☆20May 20, 2025Updated 10 months ago
- Unofficial implementation of ConvNeXt-TTS powered by lightning☆18Oct 20, 2024Updated last year
- Forced alignment decoder for Whisper.☆15Mar 13, 2024Updated 2 years ago
- Chinese and English Bilinguish G2P☆22Jul 16, 2023Updated 2 years ago
- Official implementation of paper: Shallow Flow Matching for Coarse-to-Fine Text-to-Speech Synthesis☆51Sep 20, 2025Updated 6 months ago
- Incremental Disentanglement for Environment-Aware Zero-Shot Text-to-Speech Synthesis☆27Mar 21, 2025Updated last year
- In-car multi-channel speech transcription system of AISHELL-5.☆42Jun 9, 2025Updated 9 months ago
- ☆15Aug 22, 2025Updated 7 months ago
- PitchVC: Pitch Conditioned Any-to-Many Voice Conversion☆36Jun 6, 2024Updated last year
- Managed hosting for WordPress and PHP on Cloudways • AdManaged hosting with the flexibility to host WordPress, Magento, Laravel, or PHP apps, on multiple cloud providers. Cloudways by DigitalOcean.
- Text-to-Speech conversor for Basque and Spanish. It includes linguistic processing and built voices for the languages aforementioned. Its…☆18Jan 15, 2026Updated 2 months ago
- Generative Expressive Conversational Speech Synthesis (Accepted by MM'2024)☆62Nov 1, 2024Updated last year
- Implementation of the paper "Variable Bitrate Residual Vector Quantization for Audio Coding"☆11Apr 10, 2025Updated 11 months ago
- Speech-To-Text forced-alignment Speech processing Universal PERformance Benchmark☆36May 7, 2025Updated 10 months ago
- Official PyTorch implementation of (ICME2025 oral) "AutoStyle-TTS: Retrieval-Augmented Generation based Automatic Style Matching Text-to-…☆24Feb 1, 2026Updated last month
- ☆24Sep 20, 2024Updated last year
- ☆13Aug 24, 2023Updated 2 years ago