MiniMax-AI / audio-tools
A collection of optimized utilities for text-to-audio processing, enhancing both training and inference workflows. This repository contains robust implementations adapted from open-source libraries.
☆12 Updated 3 weeks ago
Alternatives and similar repositories for audio-tools:
Users who are interested in audio-tools compare it to the libraries listed below:
- ☆16 Updated 3 months ago
- XVERSE-MoE-A36B: A multilingual large language model developed by XVERSE Technology Inc. ☆36 Updated 7 months ago
- Incredibly descriptive audiovisual summaries for videos ☆40 Updated 8 months ago
- ☆12 Updated last year
- ☆29 Updated last year
- Zeta implementation of a reusable, plug-and-play feedforward from the paper "Exponentially Faster Language Modeling" ☆16 Updated 5 months ago
- A client/server app for real‑time voice chat with AI. Live speech‑to‑text, instant AI replies. ☆12 Updated this week
- ☆13 Updated 8 months ago
- Official code implementation of General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model ☆23 Updated 7 months ago
- ☆16 Updated 10 months ago
- ☆27 Updated 2 months ago
- OVALChat is a customizable Web app aimed at conducting user studies with chatbots ☆28 Updated last year
- XVERSE-MoE-A4.2B: A multilingual large language model developed by XVERSE Technology Inc. ☆37 Updated 11 months ago
- Bambo is a new proxy framework. Compared with mainstream frameworks, it is more lightweight and flexible and can handle various load task… ☆35 Updated 2 months ago
- ☆30 Updated last year
- [NCMMSC'2024] Emotion-Aware Prosodic Phrasing for Expressive Text-to-Speech ☆22 Updated 8 months ago
- Open source intent recognition framework powered by LLMs. ☆18 Updated 4 months ago
- Implementation of the LDP module block in PyTorch and Zeta from the paper: "MobileVLM: A Fast, Strong and Open Vision Language Assistant … ☆16 Updated last year
- Official implementations for paper: DreamTalk: When Expressive Talking Head Generation Meets Diffusion Probabilistic Models ☆15 Updated last year
- A minimalistic, hackable code base to finetune Wan video generation model ☆39 Updated last week
- ☆20 Updated 10 months ago
- In-Context Alignment: Chat with Vanilla Language Models Before Fine-Tuning ☆34 Updated last year
- ☆18 Updated last year
- Nexusflow function call, tool use, and agent benchmarks. ☆19 Updated 4 months ago
- AgentParse is a high-performance parsing library designed to map various structured data formats (such as Pydantic models, JSON, YAML, an… ☆13 Updated last week
- An open-source toolkit helping developers build natural language database query solutions ☆11 Updated last week
- Luann allows you to create an LLM agent, which has a complete memory module (long-term memory, short-term memory) and knowledge module (Variou… ☆21 Updated last month
- Evaluate the Opinion Leadership of LLMs in the Werewolf Game ☆9 Updated 8 months ago
- ☆37 Updated 2 years ago
- Implementation of the premier Text to Video model from OpenAI ☆57 Updated 5 months ago