win4r / VideoFinder-Llama3.2-vision-Ollama
VideoFinder is an advanced video analysis tool powered by multimodal AI, designed to help users easily locate and identify specific objects or people within video content. By combining the capabilities of the Llama Vision model with a streamlined web interface, it enables real-time, frame-by-frame video analysis with natural language descriptions.
☆156 Updated 8 months ago
Alternatives and similar repositories for VideoFinder-Llama3.2-vision-Ollama
Users that are interested in VideoFinder-Llama3.2-vision-Ollama are comparing it to the libraries listed below
- A tool for converting text corpora into training sets (txt to dataset) ☆93 Updated last year
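- A merged version of GraphRAG-Ollama-UI and GraphRAG4OpenWebUI (includes a Gradio web UI for configuring and generating the RAG index, and a FastAPI server that provides the RAG API service) ☆109 Updated 10 months ago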
- Sample GLM4V + ChatTTS AI assistant ☆84 Updated last year
- Converts files into Markdown so that RAG pipelines and LLMs can understand them; based on markitdown and MinerU, which can provide high-quality PDF parsing. ☆118 Updated 3 months ago
- Builds a front-end page that calls OpenManus through the Flask framework. ☆188 Updated 3 months ago
- A common framework for voice and text interactions with LLMs ☆93 Updated 8 months ago
- ☆254 Updated 6 months ago
- This repo uses ChatTTS and Ollama to create a local LLM audio tool. ☆34 Updated last year
- Generate PPT slides with an LLM ☆98 Updated last year
- A function-calling tool that can be deployed to Cloudflare Workers with an OpenAPI schema ☆97 Updated last year
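- virtualwife-llm-factory is an LLM training framework that addresses the high barrier to entry for training virtual characters. Its core capabilities include automatic corpus generation, personality-shaping evaluation, and fine-tuning based on Chinese domestic LLMs. It is still under development; feel free to star the repo to follow along. ☆48 Updated last month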
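- AutoGen's latest architecture, v0.4, has officially released its first stable version. v0.4 is a from-scratch rewrite of AutoGen that aims to create a more robust, extensible, and easier-to-use cross-language library for building agents. Its API uses a layered architecture, with multiple sets of interfaces to meet the needs of different scenarios. ☆109 Updated 3 months ago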
- Dive into LLM Agents ☆18 Updated last year
- Video understanding: Qwen multimodal video model & Dify ☆60 Updated 10 months ago
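- Real-time STT connected to the OpenAI API / Zhipu AI (streaming LLM) and GPT-SOVITS / Edge-TTS; services are called across networks through a web page to achieve real-time conversation. ☆401 Updated 6 months ago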
- Builds a multi-agent collaboration application with CrewAI + FastAPI and exposes it as an API service; supports GPT, Chinese domestic LLMs, and local Ollama LLMs. ☆75 Updated 8 months ago
- A tool for creating pre-training datasets for language models, supporting one-click batch processing for both text and image datasets. … ☆34 Updated 7 months ago
- Adds a 🚀 streaming web service to GraphRAG: compatible with the OpenAI SDK, supports accessible entity links 🔗 and suggested questions, works with local embedding models, and fixes many issues. ☆255 Updated 3 months ago
- An open-source agentic workflow framework/showcase that turns chat text into control actions, powered by the Agently AI application development framework. ☆28 Updated 9 months ago
- Using GPT to parse PDF ☆99 Updated 10 months ago
- Easegen is an open-source digital human course creation platform offering comprehensive solutions from course production and video manage… ☆235 Updated 2 months ago
- GraphRAG4OpenWebUI integrates Microsoft's GraphRAG technology into Open WebUI, providing a versatile information retrieval API. It combin… ☆530 Updated 6 months ago
- RAG-GPT, leveraging LLM and RAG technology, learns from user-customized knowledge bases to provide contextually relevant answers for a wi… ☆453 Updated last year
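- Uses the open-source LangFlow framework to build LLM applications with zero code, such as an intelligent customer-service assistant for data-plan recommendations and RAG applications, and shows two ways to integrate the resulting workflows into your own project. ☆26 Updated 10 months ago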
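- This project implements complex AI agent workflows using the FastAPI back-end framework and CrewAI. The code implements CrewAI's Flows feature, supports persistent storage and querying of intermediate Flow results (MySQL), and supports running multiple Flows in parallel (Celery is a powerful asynchronous task queue/job queue library). ☆85 Updated 3 months ago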
- An API release built on the funasr version of SenseVoice that integrates seamlessly with oneapi. ☆67 Updated 10 months ago
- A GUI version of GOT-OCR that provides OCR, PDF export, batch processing, and other features, but no training functionality. ☆177 Updated last week
- 😆 Generate PPT with an LLM, following your template. 📢 Not only uses an LLM to generate PPT, but also follows your favorite PPT template. Just… ☆89 Updated last year
- microsoft/graphrag with Chinese language support 🇨🇳🇨🇳🇨🇳 ☆48 Updated 3 months ago
- ☆177 Updated 5 months ago