FunAudioLLM/ThinkSound

[NeurIPS 2025] PyTorch implementation of [ThinkSound], a unified framework for generating audio from any modality, guided by Chain-of-Thought (CoT) reasoning.

☆1,369

Alternatives and similar repositories for ThinkSound

Users who are interested in ThinkSound are comparing it to the repositories listed below.

liuhuadai / OmniAudio
[ICML 2025] PyTorch Implementation of "OmniAudio: Generating Spatial Audio from 360-Degree Video"
☆374 · Jun 27, 2025 · Updated last year
PKU-Alignment / align-anything
Align Anything: Training All-modality Model with Feedback
☆4,662 · Nov 27, 2025 · Updated 7 months ago
risesoft-y9 / Digital-Infrastructure
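Digital Infrastructure (数字底座) is a unified, secure management support platform for the digital transformation of large government bodies and enterprises, built on identity authentication, organizational structure, positions and duties, application systems, resource roles, data catalogs, security controls, and related functions. It follows the three-administrator management model, supports microservices, multi-tenancy, containerization, and domestic (Chinese) technology stacks, lets users quickly build their own business applications with a code generator, and can also link to various…
☆2,597 · Updated this week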
Klavis-AI / klavis
Klavis AI: MCP integration platforms that let AI agents use tools reliably at any scale
☆5,763 · Jun 1, 2026 · Updated last month
fudan-generative-vision / hallo2
[ICLR 2025] Hallo2: Long-Duration and High-Resolution Audio-driven Portrait Image Animation
☆3,714 · Feb 27, 2025 · Updated last year
FunAudioLLM / FunMusic
A fundamental toolkit designed for music, song, and audio generation
☆1,362 · May 20, 2025 · Updated last year
hyperai / tvm-cn
TVM documentation in Simplified Chinese (TVM 中文文档)
☆3,818 · May 20, 2026 · Updated last month
TJU-DRL-LAB / AI-Optimizer
The next-generation deep reinforcement learning toolkit
☆3,463 · Jun 16, 2023 · Updated 3 years ago
fudan-generative-vision / hallo
Hallo: Hierarchical Audio-Driven Visual Synthesis for Portrait Image Animation
☆8,655 · Sep 14, 2024 · Updated last year
EvilGenius-dot / RustMinerSystem
💰The only genuine version💰 minerproxy minerproxy minerproxy minerproxy minerproxy minerproxy minerproxy minerproxy minerproxy minerproxy mining pool fee skimming, mining pool proxy, mining pool relay, mining pool fee…
☆3,873 · Updated this week
Everlyn-Labs / Everlyn-1
The first open autoregressive foundational video AI model.
☆2,892 · Oct 14, 2024 · Updated last year
lmxue / Audio-FLAN
Audio-FLAN
☆161 · Sep 23, 2025 · Updated 9 months ago
Docta-ai / docta
A Doctor for your data
☆3,481 · Jun 16, 2026 · Updated 2 weeks ago
Text-to-Audio / AudioLCM
PyTorch Implementation of AudioLCM (ACM-MM'24): efficient, high-quality text-to-audio generation with a latent consistency model.
☆1,162 · Jul 1, 2025 · Updated last year
hkchengrex / MMAudio
[CVPR 2025] MMAudio: Taming Multimodal Joint Training for High-Quality Video-to-Audio Synthesis
☆2,221 · Feb 23, 2026 · Updated 4 months ago
FxPool / FXMinerProxy
🔥minerproxy, minerproxy, minerproxy, minerproxy, minerproxy, minerproxy, minerproxy, minerproxy, minerproxy, minerproxy, mining pool fee skimming, mining pool relay, built for mining farm operations and maintenance
☆3,707 · May 22, 2026 · Updated last month
Hunyuan-PromptEnhancer / PromptEnhancer
[CVPR 2026] PromptEnhancer is a prompt-rewriting tool, refining prompts into clearer, structured versions for better image generation.
☆3,714 · Jun 10, 2026 · Updated 3 weeks ago
xzf-thu / Audio-Reasoner
The first Large Audio Language Model that enables native in-depth thinking, which is trained on large-scale audio Chain-of-Thought data.
☆297 · May 15, 2025 · Updated last year
WuKongOpenSource / WukongCRM-11.0-JAVA
WukongCRM (悟空CRM): a CRM system with separated front end and back end, based on a Spring Cloud Alibaba microservice architecture and Vue + ElementUI
☆2,428 · Aug 27, 2021 · Updated 4 years ago
qualcomm / GenieX
Run frontier LLMs and VLMs locally on Qualcomm devices across NPU, GPU, and CPU with a few lines of code
☆8,128 · Updated this week
ZeyueT / AudioX
[ICLR 2026] Repository of AudioX
☆1,535 · Mar 10, 2026 · Updated 3 months ago
SkyworkAI / Skywork-R1V
Skywork-R1V is an advanced multimodal AI model series developed by Skywork AI, specializing in vision-language reasoning.
☆3,159 · Dec 15, 2025 · Updated 6 months ago
microsoft / UFO
UFO³: Weaving the Digital Agent Galaxy
☆9,176 · Jun 26, 2026 · Updated last week
NVIDIA / audio-flamingo
PyTorch implementation of Audio Flamingo: Series of Advanced Audio Understanding Language Models
☆1,146 · Dec 15, 2025 · Updated 6 months ago
juggleim / im-server
A high-performance IM server.
☆3,582 · Updated this week
lakesoul-io / LakeSoul
LakeSoul is an end-to-end, realtime cloud-native Lakehouse framework for fast data ingestion, concurrent updates, incremental analytics, …
☆3,240 · Jun 26, 2026 · Updated last week
nesaorg / nesa
Run AI models end-to-end encrypted.
☆3,167 · Feb 10, 2025 · Updated last year
hitsz-ids / synthetic-data-generator
SDG is a specialized framework designed to generate high-quality structured tabular data.
☆2,423 · May 25, 2026 · Updated last month
haidog-yaqub / EzAudio
High-quality Text-to-Audio Generation with Efficient Diffusion Transformer
☆331 · Dec 17, 2025 · Updated 6 months ago
ace-step / ACE-Step
ACE-Step: A Step Towards Music Generation Foundation Model
☆4,609 · Feb 15, 2026 · Updated 4 months ago
XiaomiMiMo / MiMo-Audio
MiMo-Audio: Audio Language Models are Few-Shot Learners
☆1,056 · Jun 17, 2026 · Updated 2 weeks ago
TaskingAI / TaskingAI
The open source platform for AI-native application development.
☆5,389 · Dec 2, 2024 · Updated last year
ddlBoJack / Omni-Captioner
[ICLR 2026] Data Pipeline, Models, and Benchmark for Omni-Captioner.
☆138 · Apr 7, 2026 · Updated 2 months ago
Text-to-Audio / Make-An-Audio-3
Make-An-Audio-3: Transforming Text/Video into Audio via Flow-based Large Diffusion Transformers
☆121 · May 19, 2025 · Updated last year
ModelEngine-Group / fit-framework
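FIT: an enterprise-grade AI development framework providing a multi-language function engine (FIT), a streaming orchestration engine (WaterFlow), and a LangChain alternative for the Java ecosystem (FEL). It runs in dual native/Spring modes, supports hot-pluggable plugins and intelligent aggregated/distributed deployment, and seamlessly unifies large models with business systems.
☆2,109 · Mar 13, 2026 · Updated 3 months ago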
dgiot / dgiot
Open source platform for IoT: 6-minute quick deployment, 10M device connections, carrier-level stability. Low co…
☆4,824 · Apr 10, 2025 · Updated last year
facebookresearch / vggt
[CVPR 2025 Best Paper Award] VGGT: Visual Geometry Grounded Transformer
☆13,540 · May 19, 2026 · Updated last month
om-ai-lab / OmAgent
[EMNLP-2024] Build multimodal language agents for fast prototype and production
☆2,660 · Mar 19, 2025 · Updated last year
ASLP-lab / DiffRhythm
Di♪♪Rhythm: Blazingly Fast and Embarrassingly Simple End-to-End Full-Length Song Generation with Latent Diffusion
☆2,312 · Nov 27, 2025 · Updated 7 months ago