win4r / VideoFinder-Llama3.2-vision-Ollama
VideoFinder is an advanced video analysis tool powered by multimodal AI, designed to help users locate and identify specific objects or people within video content. By combining the capabilities of the Llama Vision model with a streamlined web interface, it enables real-time, frame-by-frame video analysis with natural language descriptions.
☆169Updated last year
Alternatives and similar repositories for VideoFinder-Llama3.2-vision-Ollama
Users that are interested in VideoFinder-Llama3.2-vision-Ollama are comparing it to the libraries listed below
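- A tool for converting text corpora into training sets (txt to dataset)☆93Updated last year
- Merged edition of GraphRAG-Ollama-UI + GraphRAG4OpenWebUI (includes a Gradio web UI for configuring and generating RAG indexes, and a FastAPI service that provides a RAG API)☆105Updated last year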
- Sample GLM4V + ChatTTS AI assistant☆85Updated last year
- A common framework for voice and text interactions with LLMs☆98Updated last year
- ☆274Updated last year
- Converts files into Markdown to help RAG pipelines or LLMs understand them; based on markitdown and MinerU, which can provide high-quality PDF parsing.☆132Updated 10 months ago
- A tool for creating pre-training datasets for language models, supporting one-click batch processing for both text and image datasets. 一个…☆43Updated last year
- Dive into LLM Agents☆18Updated last year
- Builds a front-end page that calls OpenManus through the Flask framework.☆220Updated 9 months ago
- Generate PPT slides with an LLM☆106Updated last year
- Video understanding: Qwen multimodal video model & Dify☆66Updated last year
- OpenKAG (Open Knowledge Augmented Generation) is an enterprise intelligent knowledge platform based on large model technology.☆57Updated 9 months ago
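- virtualwife-llm-factory is an LLM training framework that addresses the high barrier to entry for training virtual characters. Its core capabilities include automatic corpus generation, personality-shaping evaluation, and fine-tuning based on Chinese-developed LLMs. It is still under development; feel free to star and follow the project.☆50Updated 7 months ago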
- An open-source agentic workflow framework/showcase that turns chat text into control actions, powered by the Agently AI application development framework.☆29Updated last year
- A pseudo-real-time speech transcription service based on faster-whisper☆236Updated 9 months ago
- The OCR component of ragflow; an unofficial project☆54Updated last year
- A function-calling tool that can be deployed to Cloudflare Workers with an OpenAPI schema☆99Updated last year
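- A FastAPI wrapper for Alibaba's SenseVoice, released in ONNX format for a smaller footprint, with a quantized model included and GPU support. Supports speech recognition from files at a URL.☆104Updated last year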
- ☆184Updated 2 months ago
- (Integrated package) One-click use of Modelbest's latest MiniCPM-o 2.6 multimodal model for video, voice, and text conversations. | Use Modelbest's latest MiniCPM-o 2.6 multi-modal model with one c…☆15Updated 6 months ago
- TTS☆80Updated last year
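- An API release based on the funasr version of SenseVoice; integrates seamlessly with oneapi☆92Updated last year
- Real-time STT connected to the OpenAI API/Zhipu AI (streaming LLM) and GPT-SOVITS/Edge-TTS, with cross-network service calls made through a web page to achieve real-time conversation☆429Updated last year
- An MCP server that gives local developers' LLMs efficient internet search and content retrieval, saving your tokens☆127Updated last week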
- ChatTTS HTTP API☆54Updated last year
- RAG-GPT, leveraging LLM and RAG technology, learns from user-customized knowledge bases to provide contextually relevant answers for a wi…☆476Updated last year
- The tool is used for building and driving workflows specifically tailored for AI initiatives. It can be used to construct AI agents.☆161Updated last year
- Easegen is an open-source digital human course creation platform offering comprehensive solutions from course production and video manage…☆249Updated last month
- ☆149Updated last year
- ☆447Updated last month