zyxcambridge / GPT4OLinks
Reproducing GPT4O's real-time video and audio understanding
☆14Updated last year
Alternatives and similar repositories for GPT4O
Users that are interested in GPT4O are comparing it to the libraries listed below
- EdgeInfer enables efficient edge intelligence by running small AI models, including embeddings and OnnxModels, on resource-constrained de…☆48Updated last year
- Real-time video understanding and interaction through text, audio, image, and video with a large multimodal model. A real-time video understanding and interaction framework using large multimodal models, via text…☆25Updated last year
- An open-source agentic workflow framework/showcase that turns chat text into control actions, powered by the Agently AI application development framework.☆28Updated last year
- Qwen-TTS offers a robust voice synthesis service using FastAPI, supporting bilingual and dialect options. Explore seamless audio generati…☆68Updated this week
- ☆25Updated last year
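- AI development tutorial based on Roo Cline + DeepSeek☆70Updated 7 months ago
- Cross-platform, open-source Chinese NoteBookLM app based on Electron, using DeepSeek + Reecho.ai☆79Updated 11 months ago
- Chinese prompts for Sora | Short-video prompt techniques | Tuning guide. Usage guides for various scenarios. Learn how to make it do what you ask. Also covers Sora's multi-scenario applications.☆65Updated this week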
- [EMNLP 2025 Demo] PresentAgent: Multimodal Agent for Presentation Video Generation☆104Updated last week
- AI agent application: AI children's picture book generation based on GPT, langchain, function calling, Stable diffusion, and more☆24Updated 2 years ago
- Implemented a script that automatically adjusts Qwen3's inference and non-inference capabilities, based on an OpenAI-like API. The infere…☆22Updated 5 months ago
- 🔥Your Daily Dose of AI Research from Hugging Face 🔥 Stay updated with the latest AI breakthroughs! This bot automatically collects and…☆54Updated this week
- Video understanding: Qwen video multimodal model & Dify☆65Updated last year
- To try textin document parsing, visit https://cc.co/16YSIy☆21Updated last year
- Supports BM25 + vector☆29Updated 4 months ago
- 2024 Alibaba Global Mathematics Competition AI Track Global 2nd Place Project (Agent Universe)☆73Updated last year
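- AI developer platform. The goal is to build a platform that captures video images, calls APIs for intelligent data annotation, and runs automated tests once training is complete.☆32Updated 7 years ago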
- Multi-lingual large voice generation model, providing inference, training and deployment full-stack ability.☆13Updated last year
- Open source intent recognition framework powered by LLMs.☆22Updated 9 months ago
- ☆54Updated 7 months ago
- XVERSE-MoE-A4.2B: A multilingual large language model developed by XVERSE Technology Inc.☆39Updated last year
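- Implements ChatGPT's streaming response protocol, Server-sent events, using the asynchronous non-blocking framework Tornado6.0 on Python3.10 and the front-end framework Vue.js3☆23Updated 2 years ago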
- ☆36Updated 10 months ago
- codewithgpu.com python client package☆18Updated 2 years ago
- Eko Browser Extension Template☆33Updated 4 months ago
- AIGC application integrating LLM and SDXL☆29Updated last year
- Big map for Google I/O 2025☆31Updated 4 months ago
- Creating Interactive and Embedded Physics Simulations from Static Textbook Diagrams☆23Updated 6 months ago
- WIP. Apps (100+) + AI.☆30Updated last year
- 1000 entrepreneurial ideas from ycombinator, one entrepreneurial idea per line.☆56Updated 9 months ago