A collection of optimized utilities for text-to-audio processing, enhancing both training and inference workflows. This repository contains robust implementations adapted from open-source libraries.
☆45Apr 1, 2025Updated 11 months ago
Alternatives and similar repositories for audio-tools
Users who are interested in audio-tools are comparing it to the libraries listed below.
- Official Repository of Paper: "SynParaSpeech: Automated Synthesis of Paralinguistic Datasets for Speech Generation and Understanding" (IC…☆65Jan 27, 2026Updated last month
- Aligntune: A Modular Toolkit for Post-Training Alignment of LLMs☆33Feb 23, 2026Updated last week
- This repository contains statistics about the AI Infrastructure products.☆17Feb 27, 2025Updated last year
- ☆11Updated this week
- LLM inference in C/C++☆21Mar 22, 2025Updated 11 months ago
- MOSS 003 WebSearchTool: A simple but reliable implementation☆45May 24, 2023Updated 2 years ago
- ☆19Mar 3, 2025Updated last year
- Voice conversion with just linear regression.☆35Sep 25, 2025Updated 5 months ago
- Official repo for CoVoMix: Advancing Zero-Shot Speech Generation for Human-like Multi-talker Conversations☆62Jan 16, 2025Updated last year
- The official implementation of the DIFFA series for dLLM-based large audio language model☆59Feb 2, 2026Updated last month
- ☆23Updated this week
- Complete Introduction to Building Generative AI Apps with Dify☆17May 25, 2025Updated 9 months ago
- A tool for summarizing search results and website content using FAISS, LLMs, and the Retrieval-Augmented Generation (RAG) technique.☆30Mar 26, 2025Updated 11 months ago
- JATTS: A modern, research-oriented Japanese Text-to-speech Open-sourced Toolkit☆44May 26, 2025Updated 9 months ago
- SLMTokBench for paper "SpeechTokenizer: Unified Speech Tokenizer for Speech Large Language Models"☆37Aug 29, 2023Updated 2 years ago
- Write the database metadata into the dify knowledge☆12Dec 30, 2025Updated 2 months ago
- [AutoArk] GPA (General Purpose Audio) can do ASR, TTS and voice conversion with one tiny 300M model!☆87Jan 29, 2026Updated last month
- ☆28Dec 4, 2025Updated 2 months ago
- Some microbenchmarks and design docs before commencement☆12Feb 1, 2021Updated 5 years ago
- A full-stack AI-powered business intelligence tool for non-experts, featuring serverless backend processing and a secure Streamlit fronte…☆27Feb 13, 2026Updated 2 weeks ago
- This repository contains a social real-time application for iOS devices. The app leverages Amazon IVS Real-time streaming and uses the Am…☆11Nov 19, 2025Updated 3 months ago
- Powered by Gemini☆48Dec 27, 2023Updated 2 years ago
- Towards Comprehensive Evaluation for End-to-End Spoken Dialogue Models☆50Sep 2, 2025Updated 6 months ago
- Workflow automation, but you just describe what you want and it happens.☆27Nov 22, 2025Updated 3 months ago
- FULL v0, Cursor, Manus, Same.dev, Lovable, Devin, Replit Agent, Windsurf Agent & VSCode Agent (And other Open Sourced) System Prompts, To…☆11Apr 21, 2025Updated 10 months ago
- ☆11Aug 29, 2025Updated 6 months ago
- OTIS Code☆12Mar 19, 2023Updated 2 years ago
- Official Implementation of GLAP - General Language Audio Pretraining☆61Jan 5, 2026Updated last month
- [ACMMM'2024] Generative Expressive Conversational Speech Synthesis☆44Oct 28, 2024Updated last year
- Torch Audio Forced Aligner for Mixed Chinese (Mandarin or Cantonese) and English.☆62Sep 5, 2025Updated 5 months ago
- Training, inference, and testing of the SAC speech codec model.☆99Nov 1, 2025Updated 4 months ago
- WavReward: Spoken Dialogue Models With Generalist Reward Evaluators☆54May 15, 2025Updated 9 months ago
- Muse: Towards Reproducible Long-Form Song Generation with Fine-Grained Style Control☆98Feb 18, 2026Updated last week
- rabitq rust implementation☆10Feb 4, 2026Updated 3 weeks ago
- An SSH plugin for Dify☆13Jan 16, 2026Updated last month
- Generate music videos starring yourself.☆11Apr 3, 2025Updated 11 months ago
- ☆10Dec 29, 2023Updated 2 years ago
- ☆10Mar 19, 2024Updated last year
- Python Telegraph api.☆15Mar 22, 2025Updated 11 months ago