junhoyeo / BetterOCR
🔍 Better text detection by combining multiple OCR engines (EasyOCR, Tesseract, and Pororo) with 🧠 LLM.
☆538 · Updated 2 months ago
Alternatives and similar repositories for BetterOCR:
Users who are interested in BetterOCR are comparing it to the libraries listed below.
- An open-source OCR API that leverages OpenAI's powerful language models with optimized performance techniques like parallel processing an… ☆844 · Updated 6 months ago
- Provides OCR (Optical Character Recognition) services through web applications ☆674 · Updated last year
- Omni SenseVoice: High-Speed Speech Recognition with words timestamps 🗣️🎯 ☆830 · Updated last month
- Enhance Tesseract OCR output for scanned PDFs by applying Large Language Model (LLM) corrections. ☆2,598 · Updated last month
- Extract structured text from pdfs quickly ☆454 · Updated last month
- Create API agents from OpenAPI Specs ☆178 · Updated last year
- Build, Improve Performance, and Productionize your LLM Application with an Integrated Framework ☆338 · Updated 4 months ago
- A simple "Be My Eyes" web app with a llama.cpp/llava backend ☆489 · Updated last year
- Neum AI is a best-in-class framework to manage the creation and synchronization of vector embeddings at large scale. ☆853 · Updated last year
- UniTable: Towards a Unified Table Foundation Model ☆455 · Updated 10 months ago
- Lightweight, performant, deep table extraction ☆442 · Updated last week
- Local voice chatbot for engaging conversations, powered by Ollama, Hugging Face Transformers, and Coqui TTS Toolkit ☆758 · Updated 7 months ago
- Examples and guides for using the VLM Run API ☆268 · Updated 2 weeks ago
- Action library for AI Agent ☆212 · Updated last week
- Multi-modal OCR pipeline optimized for ML training (text, figure, math, tables, diagrams) ☆18 · Updated this week
- Radient turns many data types (not just text) into vectors for similarity search, RAG, regression analysis, and more. ☆275 · Updated 3 weeks ago
- This repo provides the server side code for llmsherpa API to connect. It includes parsers for various file formats. ☆1,214 · Updated last week
- Parse vision is an open source tool to visualise what OCR is parsing in a PDF document to help developers and product teams identify if t… ☆83 · Updated 8 months ago
- WhisperFusion builds upon the capabilities of WhisperLive and WhisperSpeech to provide a seamless conversations with an AI. ☆1,591 · Updated 8 months ago
- This project collects GPU benchmarks from various cloud providers and compares them to fixed per token costs. Use our tool for efficient … ☆220 · Updated 3 months ago
- Structured and typehinted GPT responses in Python ☆736 · Updated 8 months ago
- An extremely fast implementation of whisper optimized for Apple Silicon using MLX. ☆685 · Updated 11 months ago
- A hub for various industry-specific schemas to be used with VLMs. ☆496 · Updated this week
- Crawls a Multi-Page Application to a zip file, serve the Multi-Page Application from the zip file. A MPA archiver. Could be used as a Sit… ☆475 · Updated 6 months ago
- A series of top performing Text to SQL LLMs ☆869 · Updated last year
- ⚡️ 80x faster Fasttext language detection out of the box | Split text by language ☆182 · Updated last week
- OCR Benchmark ☆435 · Updated this week
- Extension of Langchain for RAG. Easy benchmarking, multiple retrievals, reranker, time-aware RAG, and so on... ☆279 · Updated last year
- Summarize and query from a lot of heterogeneous documents. Any LLM provider, any filetype, scalable (?), WIP ☆442 · Updated last week
- Open-source platform for extracting structured data from documents using AI. ☆1,288 · Updated last month