ikantkode / qwen2.5VLM-OCR
A simple streamlit app, dockerized, to do OCR on documents. I'm lazy, idk.
☆25Updated 2 months ago
Alternatives and similar repositories for qwen2.5VLM-OCR
Users who are interested in qwen2.5VLM-OCR are comparing it to the libraries listed below.
- A simple CPU only OCR for pdf/images/word/excel to markdown. With streamlit.☆23Updated 2 months ago
- Datu Core AI Analyst open-source☆42Updated last month
- Fast local speech-to-text for any app using faster-whisper☆141Updated last month
- Open-source clone of the MidJourney web interface featuring real AI image and video generation powered by Google's Gemini SDK. Use Imagen…☆185Updated 3 months ago
- Your own Coding Agent 🤖☆86Updated 5 months ago
- Demo app for Groq plugins in LiveKit Agents☆56Updated 7 months ago
- Finally, an open source Youtube Summarizer extension☆79Updated 6 months ago
- A real-time shared memory layer for multi-agent LLM systems.☆48Updated 4 months ago
- DeepSite — AI website builder with local hosting support. Run DeepSite on your own server or offline.☆168Updated 2 months ago
- Local voice AI powered by Ollama, Kokoro, Whisper, and LiveKit.☆67Updated 6 months ago
- “A locally hosted, memory-aware AI microservice—designed for cultural continuity, decentralized intelligence, and ethical autonomy.”☆28Updated 6 months ago
- AI agents platform that gives you a workspace with an integrated team of personal assistants that can work behind the scenes to handle da…☆187Updated 3 months ago
- Command-line personal assistant using your favorite proprietary or local models with access to 30+ tools☆112Updated 4 months ago
- 🔥 LitLytics - an affordable, simple analytics platform that leverages LLMs to automate data analysis☆102Updated 11 months ago
- ChatTTS + Ollama☆83Updated last year
- ☆54Updated 5 months ago
- An API for VoiceCraft.☆25Updated last year
- A cross-platform app that unlocks your device's Gen AI capabilities☆65Updated last month
- PDF to MD UI - User Interface to Convert PDF to MarkDown for LLM and RAG☆48Updated last month
- The Ultimate Open-Source RAG-as-a-Service Platform ☕☆50Updated 4 months ago
- Whisper STT + Orpheus TTS + Gemma 3 using LM Studio to create a virtual assistant.☆70Updated 6 months ago
- Cascading voice assistant combining real-time speech recognition, AI reasoning, and neural text-to-speech capabilities.☆124Updated last month
- generate informative knowledge graph from text using open source models, ollama☆22Updated 2 months ago
- Redact PDF/image-based documents, Word, or CSV/XLSX files using a graphical user interface☆29Updated last week
- smart-llm-loader is a lightweight yet powerful Python package that transforms any document into LLM-ready chunks. Spend less time on prep…☆71Updated 8 months ago
- ☆28Updated 4 months ago
- Open source tool for transcription and subtitling, alternative to happyscribe.☆30Updated 8 months ago
- Agent MCP for ffmpeg☆209Updated 4 months ago
- Garvis: Realtime AI Voice Assistant☆38Updated last year
- A Newsletter Agent that Aggregates Articles and Generates a Newsletter - Langflow, NextJS☆60Updated 11 months ago