win4r / VideoFinder-Llama3.2-vision-Ollama
VideoFinder is a video analysis tool powered by multimodal AI, designed to help users locate and identify specific objects or people within video content. It combines the Llama Vision model with a streamlined web interface to provide real-time, frame-by-frame video analysis with natural language descriptions.
☆59 Updated last week
Related projects
Alternatives and complementary repositories for VideoFinder-Llama3.2-vision-Ollama
- Merged edition of GraphRAG-Ollama-UI and GraphRAG4OpenWebUI (a Gradio web UI for configuring and generating RAG indexes, plus a FastAPI service exposing a RAG API) ☆88 Updated 3 months ago
- Sample GLM4V + ChatTTS AI assistant ☆84 Updated 5 months ago
- Tool for converting text corpora into training sets (txt to dataset) ☆78 Updated 6 months ago
- ChatTTS HTTP API ☆48 Updated 5 months ago
- ChatTTS is a generative speech model for daily dialogue. This fork supports Ollama. ☆32 Updated 5 months ago
- Generate PPT with an LLM ☆65 Updated 8 months ago
- Using GPT to parse PDF ☆68 Updated 2 months ago
- TTS ☆74 Updated 5 months ago
- An open-source agentic workflow framework/showcase that turns chat text into control actions, powered by the Agently AI application development framework. ☆24 Updated 2 months ago
- Real-time STT connected to the OpenAI API or Zhipu AI (streaming LLM) and GPT-SOVITS/Edge-TTS, calling the services across networks from a web page to achieve real-time conversation ☆254 Updated 4 months ago
- 🌱 Gateway that converts the official Zhipu Qingyan agent API into an OpenAI-compatible protocol 👋 lowering the barrier for developers to integrate the API ☆38 Updated 6 months ago
- Seamlessly integrates, handles, and schedules Dify and Dify on WeChat, with web-based visual multi-user management and one-click ChatBot launch, simplifying the creation of impressive, responsive ChatBot applications. ☆44 Updated 3 months ago
- This repo uses ChatTTS and Ollama to create a local LLM audio tool. ☆19 Updated 4 months ago
- ☆68 Updated 10 months ago
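- MinerU is an open-source, high-quality PDF parsing tool based on deep learning that automatically extracts text, tables, images, formulas, and other content from PDF documents and offers rich analysis, statistics, and search features. This project provides a simplified WebUI for it, letting users upload PDF files and see extraction results in real time. ☆53 Updated 3 weeks ago
- An API interface project for CosyVoice ☆79 Updated 3 weeks ago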
- ☆116 Updated last week
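- An MIT-licensed, deployable starter kit for building and customizing your own version of AI town - a virtual town where AI characters live… ☆39 Updated 5 months ago
- Pseudo-real-time speech transcription service based on faster-whisper ☆182 Updated 2 months ago
- AI Q&A Search Engine ➡️ an open-source AI search engine built with LangChain and SearXNG ☆108 Updated 2 months ago
- Adds a 🚀 streaming web server to GraphRAG, compatible with the OpenAI SDK, with accessible entity links 🔗, suggested questions, local embedding model compatibility, and many bug fixes. ☆197 Updated last week
- EZ-Work AI document translation: an open-source AI document translation assistant for everyone that calls OpenAI and other LLM APIs quickly and at low cost to translate txt/markdown/word/csv/excel/pdf/ppt documents. ☆128 Updated this week
- GUI version of GOT-OCR, providing OCR, PDF export, batch processing, and other features, but no training functionality ☆111 Updated this week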
- ☆142 Updated 5 months ago
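- Awada is a team knowledge assistant agent built for WeChat scenarios. It learns autonomously online from group chats, official accounts, websites, and other sources (and also accepts user-uploaded documents), builds a private team knowledge base, and provides team members with Q&A, document lookup, and writing (Word) services. ☆180 Updated last week
- Real time faster whisper gradio ☆25 Updated last month
- Dive into LLM Agents ☆15 Updated 5 months ago
- Education system adapted from the Linly-Talker digital human, including online course summarization, digital human dialogue, and chatbot dialogue; the project can be deployed on AutoDL ☆20 Updated 5 months ago
- ChatPilot: Chat Agent Web UI, a chat front end supporting Google search, chat over files and URLs (RAG), and a code interpreter; reproduces Kimi Chat (drag in files, send in URLs). ☆510 Updated last week
- Chinese-language test question bank for large models, community edition ☆54 Updated last year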