junhoyeo / BetterOCR
Better text detection by combining multiple OCR engines (EasyOCR, Tesseract, and Pororo) with an LLM.
★467 · Updated 4 months ago
Related projects:
- Enhance Tesseract OCR output for scanned PDFs by applying Large Language Model (LLM) corrections. ★1,999 · Updated 3 weeks ago
- A simple "Be My Eyes" web app with a llama.cpp/llava backend. ★479 · Updated 9 months ago
- Local voice chatbot for engaging conversations, powered by Ollama, Hugging Face Transformers, and Coqui TTS Toolkit. ★677 · Updated last month
- Neum AI is a best-in-class framework to manage the creation and synchronization of vector embeddings at large scale. ★821 · Updated 8 months ago
- Remove background directly in your browser, powered by WebGPU. ★416 · Updated 3 weeks ago
- High-performance retrieval engine for unstructured data. ★778 · Updated this week
- Extract structured text from PDFs quickly. ★292 · Updated 3 weeks ago
- TF-ID: Table/Figure IDentifier for academic papers. ★206 · Updated 2 months ago
- ★419 · Updated this week
- WhisperFusion builds upon the capabilities of WhisperLive and WhisperSpeech to provide seamless conversations with an AI. ★1,509 · Updated last month
- UniTable: Towards a Unified Table Foundation Model. ★338 · Updated 3 months ago
- Enterprise-grade API gateway that helps you monitor and impose cost or rate limits per API key. Get fine-grained access control and mo… ★871 · Updated this week
- Yi-1.5 is an upgraded version of Yi, delivering stronger performance in coding, math, reasoning, and instruction-following capability. ★486 · Updated 2 months ago
- This repo provides the server-side code for the llmsherpa API to connect to. It includes parsers for various file formats. ★1,031 · Updated 3 weeks ago
- Zero-shot PDF OCR with gpt-4o-mini. ★1,392 · Updated this week
- Stateful load balancer custom-tailored for llama.cpp. ★518 · Updated this week
- A series of top-performing Text to SQL LLMs. ★857 · Updated 7 months ago
- Finetune llama2-70b and codellama on MacBook Air without quantization. ★443 · Updated 5 months ago
- This project provides an API with user level access support to transcribe speech to text using a finetuned and processed Whisper ASR mo… ★861 · Updated 9 months ago
- (Cross-Platform) An open source approach to locally record and enable searching everything you view on any computer. ★251 · Updated 4 months ago
- RAG (Retrieval-Augmented Generation) Chatbot Examples Using PyMuPDF. ★282 · Updated this week
- Crawls a Multi-Page Application to a zip file and serves the Multi-Page Application from the zip file. An MPA archiver. Could be used as a Sit… ★468 · Updated 2 months ago
- A framework for building, experimenting, deploying, and continuously iterating on your LLM application. ★290 · Updated this week
- Turnkey self-hosted offline transcription and diarization service with LLM summary. ★689 · Updated 3 months ago
- File Parser optimised for LLM Ingestion with no loss. Parse PDFs, Docx, PPTx in a format that is ideal for LLMs. ★497 · Updated 3 weeks ago
- ★544 · Updated this week
- An API to transcribe audio with OpenAI's Whisper Large v3! ★166 · Updated 3 weeks ago
- Finetune an LLM to speak like you based on your WhatsApp Conversations. ★339 · Updated 4 months ago
- Radient turns many data types (not just text) into vectors for similarity search, RAG, regression analysis, and more. ★260 · Updated last month
- Dropbase helps developers build and prototype web apps faster with AI. Dropbase is local-first and self-hosted. ★1,054 · Updated this week