[NeurIPS 2025] PyTorch implementation of ThinkSound, a unified framework for generating audio from any modality, guided by Chain-of-Thought (CoT) reasoning.
☆1,366Apr 3, 2026Updated 2 months ago
Alternatives and similar repositories for ThinkSound
Users that are interested in ThinkSound are comparing it to the libraries listed below.
- [ICML 2025] PyTorch Implementation of "OmniAudio: Generating Spatial Audio from 360-Degree Video"☆373Jun 27, 2025Updated 11 months ago
- Align Anything: Training All-modality Model with Feedback☆4,658Nov 27, 2025Updated 6 months ago
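- Digital Base (数字底座) is a unified, secure management support platform for the digital transformation of large government bodies and enterprises, built on identity authentication, organizational structure, positions and duties, application systems, resource roles, data catalogs, and security controls. Based on the three-administrator management model, it offers microservices, multi-tenancy, containerization, and support for domestic technology stacks, lets users quickly build their own business applications with a code generator, and can also link to…☆2,594Updated this week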
- Klavis AI: MCP integration platforms that let AI agents use tools reliably at any scale☆5,747Jun 1, 2026Updated last week
- [ICLR 2025] Hallo2: Long-Duration and High-Resolution Audio-driven Portrait Image Animation☆3,704Feb 27, 2025Updated last year
- A fundamental toolkit designed for music, song, and audio generation☆1,355May 20, 2025Updated last year
- TVM Documentation in Chinese Simplified / TVM 中文文档☆3,791May 20, 2026Updated 3 weeks ago
- The next generation deep reinforcement learning toolkit☆3,463Jun 16, 2023Updated 2 years ago
- Hallo: Hierarchical Audio-Driven Visual Synthesis for Portrait Image Animation☆8,647Sep 14, 2024Updated last year
- 💰The only genuine version💰 minerproxy: mining-pool fee skimming, mining-pool proxy, mining-pool relay, mining-pool…☆3,872Updated this week
- The first open autoregressive foundational video AI model.☆2,892Oct 14, 2024Updated last year
- Audio-FLAN☆161Sep 23, 2025Updated 8 months ago
- A Doctor for your data☆3,482Jan 14, 2025Updated last year
- PyTorch Implementation of AudioLCM (ACM-MM'24): efficient and high-quality text-to-audio generation with a latent consistency model.☆1,162Jul 1, 2025Updated 11 months ago
- [CVPR 2025] MMAudio: Taming Multimodal Joint Training for High-Quality Video-to-Audio Synthesis☆2,198Feb 23, 2026Updated 3 months ago
- 🔥 minerproxy: mining-pool fee skimming, mining-pool relay, built for mining farm operations☆3,632May 22, 2026Updated 3 weeks ago
- [CVPR 2026] PromptEnhancer is a prompt-rewriting tool, refining prompts into clearer, structured versions for better image generation.☆3,698May 18, 2026Updated 3 weeks ago
- The first Large Audio Language Model that enables native in-depth thinking, which is trained on large-scale audio Chain-of-Thought data.☆297May 15, 2025Updated last year
- WuKong CRM (悟空CRM): a CRM system with separated front end and back end, based on the Spring Cloud Alibaba microservice architecture and Vue ElementUI☆2,424Aug 27, 2021Updated 4 years ago
- Run frontier LLMs and VLMs with day-0 model support across GPU, NPU, and CPU, with comprehensive runtime coverage for PC (Python/C++), mo…☆8,094Updated this week
- Skywork-R1V is an advanced multimodal AI model series developed by Skywork AI, specializing in vision-language reasoning.☆3,158Dec 15, 2025Updated 5 months ago
- [ICLR 2026] Repository of AudioX☆1,524Mar 10, 2026Updated 3 months ago
- UFO³: Weaving the Digital Agent Galaxy☆8,936Jun 6, 2026Updated last week
- PyTorch implementation of Audio Flamingo: Series of Advanced Audio Understanding Language Models☆1,140Dec 15, 2025Updated 5 months ago
- A high-performance IM server.☆3,576Updated this week
- LakeSoul is an end-to-end, realtime and cloud native Lakehouse framework with fast data ingestion, concurrent update and incremental data…☆3,238Updated this week
- Run AI models end-to-end encrypted.☆3,154Feb 10, 2025Updated last year
- SDG is a specialized framework designed to generate high-quality structured tabular data.☆2,421May 25, 2026Updated 2 weeks ago
- High-quality Text-to-Audio Generation with Efficient Diffusion Transformer☆331Dec 17, 2025Updated 5 months ago
- ACE-Step: A Step Towards Music Generation Foundation Model☆4,565Feb 15, 2026Updated 3 months ago
- MiMo-Audio: Audio Language Models are Few-Shot Learners☆1,045Mar 3, 2026Updated 3 months ago
- The open source platform for AI-native application development.☆5,381Dec 2, 2024Updated last year
- [ICLR 2026] Data Pipeline, Models, and Benchmark for Omni-Captioner.☆136Apr 7, 2026Updated 2 months ago
- Make-An-Audio-3: Transforming Text/Video into Audio via Flow-based Large Diffusion Transformers☆121May 19, 2025Updated last year
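- FIT: an enterprise-grade AI development framework providing a multi-language function engine (FIT), a streaming orchestration engine (WaterFlow), and a LangChain alternative for the Java ecosystem (FEL). It runs in native or Spring mode, supports hot-swappable plugins and intelligent aggregated or distributed deployment, and seamlessly unifies large models with business systems.☆2,110Mar 13, 2026Updated 3 months ago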
- Open source platform for IoT: 6-minute quick deployment, 10M device connections, carrier-level stability. Low co…☆4,822Apr 10, 2025Updated last year
- The official code repository for LeVo: High-Quality Song Generation with Multi-Preference Alignment☆1,654Mar 12, 2026Updated 3 months ago
- [EMNLP-2024] Build multimodal language agents for fast prototype and production☆2,649Mar 19, 2025Updated last year
- [CVPR 2025 Best Paper Award] VGGT: Visual Geometry Grounded Transformer☆13,326May 19, 2026Updated 3 weeks ago