Data manipulation and transformation for audio signal processing, powered by PyTorch
☆11Sep 30, 2024Updated last year
Alternatives and similar repositories for audio
Users that are interested in audio are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.
Sorting:
- Train no-reference speech quality estimators with multiple datasets via learned, per-dataset alignments.☆18Aug 1, 2025Updated 9 months ago
- ☆16Nov 9, 2023Updated 2 years ago
- faster inference☆28Jan 20, 2025Updated last year
- ☆36Sep 6, 2025Updated 8 months ago
- semantic tokenizer for speech and music☆21Jul 6, 2025Updated 10 months ago
- Managed Kubernetes at scale on DigitalOcean • AdDigitalOcean Kubernetes includes the control plane, bandwidth allowance, container registry, automatic updates, and more for free.
- [ACL 2025] OZSpeech: One-step Zero-shot Speech Synthesis with Learned-Prior-Conditioned Flow Matching☆45Feb 9, 2025Updated last year
- One command to start a streaming ASR server.☆12Oct 2, 2024Updated last year
- [EMNLP 2025 Findings] Official code for EZ-VC: Easy Zero-shot Any-to-Any Voice Conversion☆39Sep 9, 2025Updated 8 months ago
- ☆25Jun 19, 2025Updated 11 months ago
- (WIP)long form speech generatoins☆31Apr 2, 2025Updated last year
- One command to build TLG.fst for WeNet.☆30Oct 11, 2022Updated 3 years ago
- Code associated with the paper: CTC-DRO: Robust Optimization for Reducing Language Disparities in Speech Recognition.☆17May 16, 2025Updated last year
- Hpyformer base FunASR☆30Nov 5, 2024Updated last year
- PyTorch implementation for HyperMixing, a linear-time token-mixing technique used in HyperMixer architecture☆26Jun 12, 2023Updated 2 years ago
- Bare Metal GPUs on DigitalOcean Gradient AI • AdPurpose-built for serious AI teams training foundational models, running large-scale inference, and pushing the boundaries of what's possible.
- Repo of the paper "Towards Building an End-to-End Multilingual Automatic Lyrics Transcription Model""☆15Jun 28, 2024Updated last year
- ☆55Jul 1, 2024Updated last year
- CLASP: Contrastive Language-Speech Pretraining for Multilingual Multimodal Information Retrieval☆13Jun 27, 2025Updated 11 months ago
- 基于 Sherpa-ONNX 实现在线下载模型的端侧实时语音识别应用(Implement speech recognition based on Sherpa-ONNX by downloading the model online.)☆29Feb 27, 2025Updated last year
- Wenet speech to text for react native☆10Nov 1, 2022Updated 3 years ago
- Training code for kokoro tts model☆41Nov 15, 2025Updated 6 months ago
- Unsupervised word segmentation and clustering of speech☆13Feb 17, 2017Updated 9 years ago
- A neural speech codec based on discrete WavLM representations☆26Aug 28, 2024Updated last year
- ☆40Jul 15, 2025Updated 10 months ago
- 1-Click AI Models by DigitalOcean Gradient • AdDeploy popular AI models on DigitalOcean Gradient GPU virtual machines with just a single click. Zero configuration with optimized deployments.
- SpeechPlus: Small LLM-Based Text-to-Speech Library 🚀☆21May 20, 2025Updated last year
- Unofficial implementation of ConvNeXt-TTS powered by lightning☆18Oct 20, 2024Updated last year
- Forced alignment decoder for Whisper.☆16Mar 13, 2024Updated 2 years ago
- Chinese and English Bilinguish G2P☆22Jul 16, 2023Updated 2 years ago
- Official implementation of paper: Shallow Flow Matching for Coarse-to-Fine Text-to-Speech Synthesis☆53Sep 20, 2025Updated 8 months ago
- Incremental Disentanglement for Environment-Aware Zero-Shot Text-to-Speech Synthesis☆27Mar 21, 2025Updated last year
- In-car multi-channel speech transcription system of AISHELL-5.☆44Jun 9, 2025Updated 11 months ago
- ☆15Apr 16, 2026Updated last month
- PitchVC: Pitch Conditioned Any-to-Many Voice Conversion☆36Jun 6, 2024Updated last year
- Deploy to Railway using AI coding agents - Free Credits Offer • AdUse Claude Code, Codex, OpenCode, and more. Autonomous software development now has the infrastructure to match with Railway.
- Text-to-Speech conversor for Basque and Spanish. It includes linguistic processing and built voices for the languages aforementioned. Its…☆18Jan 15, 2026Updated 4 months ago
- Generative Expressive Conversational Speech Synthesis (Accepted by MM'2024)☆62Nov 1, 2024Updated last year
- Implementation of the paper "Variable Bitrate Residual Vector Quantization for Audio Coding"☆11Apr 10, 2025Updated last year
- Speech-To-Text forced-alignment Speech processing Universal PERformance Benchmark☆36May 7, 2025Updated last year
- Official PyTorch implementation of (ICME2025 oral) "AutoStyle-TTS: Retrieval-Augmented Generation based Automatic Style Matching Text-to-…☆26Feb 1, 2026Updated 3 months ago
- ☆10Jun 11, 2024Updated last year
- ☆13Aug 24, 2023Updated 2 years ago