zyxcambridge / GPT4O
Reproducing GPT4O's real-time video and audio understanding
☆14Updated last year
Alternatives and similar repositories for GPT4O
Users that are interested in GPT4O are comparing it to the libraries listed below
- An open-source agentic workflow framework/showcase that turns chat text into control actions, powered by the Agently AI application development framework.☆29Updated last year
- Real-time video understanding and interaction through text, audio, image, and video with a large multi-modal model. A real-time video understanding and interaction framework built on large multi-modal models, via text…☆26Updated last year
- 🔥Your Daily Dose of AI Research from Hugging Face 🔥 Stay updated with the latest AI breakthroughs! This bot automatically collects and…☆56Updated last week
- EdgeInfer enables efficient edge intelligence by running small AI models, including embeddings and OnnxModels, on resource-constrained de…☆49Updated last year
- [EMNLP 2025 Demo] PresentAgent: Multimodal Agent for Presentation Video Generation☆121Updated last month
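- Chinese prompts for Sora | Short-video prompt techniques | Tuning guide. Usage guides for a variety of scenarios. Learn how to make it do what you say. Also covers Sora's multi-scenario applications.☆110Updated this week
- An AIGC application integrating LLM and SDXL☆29Updated 2 years ago
- An AI-agent application: AI children's picture-book generation based on GPT, langchain, function calling, Stable diffusion, and more☆25Updated 2 years ago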
- Exploration of the multi modal fuyu-8b model of Adept. 🤓 🔍☆27Updated 2 years ago
- A handy tool for prompt engineers that compares the results of multiple prompts across multiple LLMs at once☆96Updated 2 years ago
- Collect VLM models that can be tried online.☆14Updated last year
- codewithgpu.com python client package☆20Updated 2 years ago
- Implemented a script that automatically adjusts Qwen3's inference and non-inference capabilities, based on an OpenAI-like API. The infere…☆22Updated 8 months ago
- Qwen-TTS offers a robust voice synthesis service using FastAPI, supporting bilingual and dialect options. Explore seamless audio generati…☆108Updated this week
- Multi-lingual large voice generation model, providing inference, training and deployment full-stack ability.☆13Updated last year
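- AI developer platform. The goal is to build a platform that captures video and images, calls APIs for intelligent data annotation, and runs automated tests after training completes.☆33Updated 7 years ago
- Cross-platform open-source Chinese NoteBookLM app based on Electron, using DeepSeek + Reecho.ai☆82Updated last year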
- Auto Thinking Mode switch for Qwen3 in Open webui☆70Updated 8 months ago
- ☆49Updated 5 months ago
- A Tampermonkey userscript that adds a button on github to jump to deepwiki☆46Updated last week
- Luann (fka TypeAgent) lets you create many LLM-based agents (various agent types, scaling up)☆23Updated last month
- Creating Interactive and Embedded Physics Simulations from Static Textbook Diagrams☆29Updated 9 months ago
- Eko Browser Extension Template☆36Updated 7 months ago
- Turn Dify API into OpenAI API schema☆17Updated last year
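- A small tool for verifying and counting the tokens a text consumes; usable in the browser; a Chinese localization of OpenAI Tokenizer.☆63Updated last year
- Training a PPT agent with reinforcement learning☆48Updated 2 months ago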
- ☆25Updated last year
- EfficientSAM + YOLO World base model for use with Autodistill.☆10Updated last year
- ☆13Updated last year
- Collection of model-centric MCP servers☆24Updated 7 months ago