yujxx / PodAgent
PodAgent: A Comprehensive Framework for Podcast Generation
☆75Updated 2 weeks ago
Alternatives and similar repositories for PodAgent:
Users who are interested in PodAgent are comparing it to the libraries listed below.
- ☆150Updated 2 months ago
- A curated list of Video to Audio Generation☆37Updated last week
- "AI-Creator: Multi-Modal Agents for Video Production"☆82Updated this week
- SpeechAgents: Human-Communication Simulation with Multi-Modal Multi-Agent Systems☆81Updated last year
- The official repo of Qwen2-Audio chat & pretrained large audio language model proposed by Alibaba Cloud.☆33Updated 7 months ago
- ☆64Updated 7 months ago
- Baichuan-Audio: A Unified Framework for End-to-End Speech Interaction☆181Updated last month
- An LLM-based agent simulation framework that simulates human behavior and generates dynamic, text-based social graphs.☆71Updated 2 weeks ago
- Official implementation for "ScoreFlow: Mastering LLM Agent Workflows via Score-based Preference Optimization"☆64Updated 2 months ago
- flow mirror models from JZX AI Labs☆45Updated 6 months ago
- An easy-to-use, fast, and easily integrable tool for evaluating audio LLM☆89Updated last week
- FlexRAG: A RAG Framework for Information Retrieval and Generation.☆157Updated this week
- The official Soundwave repository☆198Updated last month
- OpenOmni: Official implementation of Advancing Open-Source Omnimodal Large Language Models with Progressive Multimodal Alignment and Rea…☆41Updated last month
- We Speech Transcript based on LLM, in 300 lines of code.☆159Updated last week
- A project for tri-modal LLM benchmarking and instruction tuning.☆30Updated 3 weeks ago
- ☆219Updated last month
- 🤗 R1-AQA Model: mispeech/r1-aqa☆239Updated 3 weeks ago
- ☆45Updated last week
- CLaMP 3: Universal Music Information Retrieval Across Unaligned Modalities and Unseen Languages☆142Updated last month
- (ICLR'25) A Comprehensive Framework for Developing and Evaluating Multimodal Role-Playing Agents☆66Updated 2 months ago
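- The simplest reproduction of R1 results on small models, explaining the most important essence of O1-like models and DeepSeek R1. Think is all you need. Experiments support the view that, for strong reasoning ability, the "think" process content is the core of AGI/ASI.☆44Updated 2 months ago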
- LUCY: Linguistic Understanding and Control Yielding Early Stage of Her☆37Updated last week
- Deep Reasoning Translation via Reinforcement Learning (arXiv preprint 2025); DRT: Deep Reasoning Translation via Long Chain-of-Thought (a…☆215Updated last week
- VoiceStar: Robust, Duration-controllable TTS that can Extrapolate☆104Updated 2 weeks ago
- SepLLM: Accelerate Large Language Models by Compressing One Segment into One Separator☆70Updated 4 months ago
- Code for Talk With Human-like Agents: Empathetic Dialogue Through Perceptible Acoustic Reception and Reaction (ACL24)☆42Updated 8 months ago
- ☆16Updated 9 months ago
- Bambo is a new proxy framework. Compared with mainstream frameworks, it is more lightweight and flexible and can handle various load task…☆35Updated 2 months ago
- ✨✨Freeze-Omni: A Smart and Low Latency Speech-to-speech Dialogue Model with Frozen LLM☆309Updated 3 months ago