win4r / VideoFinder-Llama3.2-vision-Ollama
VideoFinder is a video analysis tool powered by multimodal AI, designed to help users locate and identify specific objects or people within video content. By combining the Llama Vision model with a streamlined web interface, it enables real-time, frame-by-frame video analysis with natural language descriptions.
☆154Updated 7 months ago
Alternatives and similar repositories for VideoFinder-Llama3.2-vision-Ollama
Users who are interested in VideoFinder-Llama3.2-vision-Ollama are comparing it to the libraries listed below
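- GraphRAG-Ollama-UI + GraphRAG4OpenWebUI merged version (includes a Gradio web UI for configuring and generating RAG indexes, and FastAPI for serving a RAG API)☆110Updated 10 months ago
- Tool for converting text corpora into training sets (txt to dataset)☆92Updated last year
- A common framework for voice and text interactions with LLMs☆93Updated 7 months ago
- An LLM-based video surveillance system for detecting dangerous behavior, integrating vision and multimodal AI models such as YOLOv8 and GPT-4V to provide high-accuracy dangerous-behavior recognition, scene understanding, and intelligent alert analysis.☆38Updated 2 months ago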
- Sample GLM4V + ChatTTS AI assistant☆84Updated last year
- ☆250Updated 6 months ago
- Generate PPT slides with an LLM☆95Updated last year
- WeChat bot integrated with ChatGPT, iFlytek Spark (讯飞星火), and Tigerbot.☆35Updated last year
- Adds a 🚀 streaming web service to GraphRAG, compatible with the OpenAI SDK, with accessible entity links 🔗, suggested questions, support for local embedding models, and many bug fixes.☆254Updated 3 months ago
- Converts files into Markdown to help RAG pipelines or LLMs understand them; based on markitdown and MinerU, which provide high-quality PDF parsing.☆108Updated 3 months ago
- ChatTTS HTTP API☆54Updated last year
- Video understanding: Qwen multimodal video model & Dify☆60Updated 9 months ago
- A Bob plugin that calls a self-deployed Cosyvoice service to provide TTS.☆37Updated 10 months ago
- 🔥 Turn entire websites into LLM-ready markdown☆90Updated last year
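- Uses the open-source LangFlow framework to build LLM applications with zero code, such as an intelligent customer-service assistant for data-plan recommendations and RAG applications, and shows two ways to integrate the resulting workflows into your own project☆25Updated 9 months ago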
- An open-source agentic workflow framework/showcase that turns chat text into control actions, powered by the Agently AI application development framework.☆28Updated 9 months ago
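- Seamlessly integrates, handles, and schedules Dify & Dify on WeChat, with web-based visual multi-user management and one-click ChatBot startup, simplifying the creation of impressive, responsive ChatBot applications.☆69Updated 10 months ago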
- Materials prepared for the AI带路党Pro videos☆254Updated 4 months ago
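- An image search engine and management system built on multimodal vector models and vision multimodal models, supporting accurate text-to-text, text-to-image, and image-to-image search.☆42Updated last week
- Builds a multi-agent collaboration application with CrewAI + FastAPI and exposes it as an API service, with support for GPT, Chinese domestic LLMs, and local Ollama models.☆73Updated 8 months ago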
- This repo uses ChatTTS and Ollama to create a local LLM audio tool.☆33Updated 11 months ago
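- A FastAPI wrapper for Alibaba's SenseVoice, released in ONNX format for a smaller footprint, with a quantized model included and GPU support. Supports speech recognition from file URLs.☆86Updated 9 months ago
- The new AutoGen v0.4 architecture has officially shipped its first stable release. v0.4 is a from-scratch rewrite of AutoGen that aims to provide a more robust, extensible, and easier-to-use cross-language library for building agents. Its application interface uses a layered architecture, with multiple sets of APIs to suit different scenarios.☆110Updated 2 months ago
- An unofficial mem0ai provider for use with https://github.com/tonori/mem0ai-api.☆48Updated 11 months ago
- Deploys the ChatTTS text-to-speech model locally using FastAPI and Streamlit, containerized with Docker Compose.☆27Updated 9 months ago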
- ☆148Updated last year
- Asynchronous voice conversation component.☆22Updated 3 months ago
- ☆53Updated 6 months ago
- An advanced search tool built on Dify☆28Updated 10 months ago
- ☆82Updated last year