win4r / VideoFinder-Llama3.2-vision-Ollama
VideoFinder is an advanced video analysis tool powered by multimodal AI, designed to help users easily locate and identify specific objects or people within video content. By combining the capabilities of the Llama Vision model with a streamlined web interface, it enables real-time, frame-by-frame video analysis with natural language descriptions.
☆164Updated 11 months ago
Alternatives and similar repositories for VideoFinder-Llama3.2-vision-Ollama
Users that are interested in VideoFinder-Llama3.2-vision-Ollama are comparing it to the libraries listed below
- Sample GLM4V + ChatTTS AI assistant☆85Updated last year
- A tool for converting text corpora into training sets (txt to dataset)☆94Updated last year
- Merged version of GraphRAG-Ollama-UI and GraphRAG4OpenWebUI (a Gradio web UI for configuring and generating the RAG index, plus FastAPI serving the RAG API)☆104Updated last year
- Converts files into Markdown so that RAG pipelines or LLMs can understand them; based on markitdown and MinerU, which provide high-quality PDF parsing.☆128Updated 7 months ago
- MinerU API server☆78Updated 10 months ago
- A common framework for voice and text interactions with LLMs☆97Updated 11 months ago
- ☆271Updated 10 months ago
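- The latest AutoGen architecture, v0.4, has officially shipped its first stable release. v0.4 is a from-scratch rewrite of AutoGen that aims to provide a more robust, extensible, and easier-to-use cross-language library for building agents. Its API follows a layered architecture, with several sets of interfaces for different scenarios.☆111Updated 6 months ago
- This repo uses ChatTTS and Ollama to create a local LLM audio tool.☆34Updated last year
- An introduction to the core concepts of the fully revamped AutoGen, Microsoft's open-source multi-agent collaboration framework, with related example tests☆48Updated 10 months ago
- An image search engine and management system built on multimodal vector models and multimodal vision models, supporting accurate text-to-text, text-to-image, and image-to-image retrieval.☆65Updated last month
- XAgent tutorial☆36Updated 2 years ago
- A GUI version of GOT-OCR offering OCR, PDF export, batch processing, and other features, but no training functionality☆182Updated 2 weeks ago
- Video understanding: the Qwen multimodal video model & Dify☆65Updated last year
- ChatTTS is a generative speech model for daily dialogue. This fork supports Ollama.☆44Updated last year
- An education system adapted from the Linly-Talker digital human, including online course summarization, digital-human dialogue, and chatbot dialogue; the project can be deployed on AutoDL☆34Updated last year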
- Generate PPT with an LLM☆101Updated last year
- ☆285Updated last year
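- Real-time STT connected to the OpenAI API / Zhipu AI (streaming LLM) and GPT-SOVITS / Edge-TTS, calling services across networks through a web page to achieve real-time conversation☆421Updated 10 months ago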
- TTS☆79Updated last year
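- Repository for Chinese post-trained Phi3 models☆324Updated 11 months ago
- A FastAPI wrapper for Alibaba's SenseVoice, released as ONNX for a smaller footprint, with a quantized model included and GPU support. Supports speech recognition on files fetched from a URL.☆103Updated last year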
- ☆148Updated last year
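- A pseudo-real-time speech transcription service based on faster-whisper☆230Updated 6 months ago
- Adds a 🚀 streaming web service to GraphRAG: compatible with the OpenAI SDK, supports accessible entity links 🔗 and suggested questions, works with local embedding models, and fixes many issues.☆260Updated 7 months ago
- virtualwife-llm-factory is an LLM training framework that lowers the barrier to entry for training virtual characters. Its core capabilities include automatic corpus generation, personality-shaping evaluation, and fine-tuning on Chinese-developed LLMs. It is still under development; feel free to star and follow the project.☆50Updated 4 months ago
- Materials prepared for the AI带路党Pro videos☆274Updated 8 months ago
- LLM voice chat project that connects ChatTTS with a locally deployed Ollama to enable voice conversations with an LLM☆64Updated last year
- An API release based on the FunASR version of SenseVoice that integrates seamlessly with oneapi☆83Updated last year
- A function-calling tool that can be deployed to Cloudflare Workers with an OpenAPI schema☆100Updated last year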