ses4255 / Versatile-OCR-Program
Multi-modal OCR pipeline optimized for ML training (text, figure, math, tables, diagrams)
☆609 · Updated 2 weeks ago
Alternatives and similar repositories for Versatile-OCR-Program:
Users who are interested in Versatile-OCR-Program are comparing it to the libraries listed below.
- ☆813 · Updated last week
- discover story relationships ☆222 · Updated 2 months ago
- An open-source OCR API that leverages OpenAI's powerful language models with optimized performance techniques like parallel processing an… ☆848 · Updated 7 months ago
- ☆436 · Updated 7 months ago
- Summarize and query from a lot of heterogeneous documents. Any LLM provider, any filetype, scalable (?), WIP ☆447 · Updated last week
- A powerful document AI question-answering tool that connects to your local Ollama models. Create, manage, and interact with RAG systems f… ☆969 · Updated 3 weeks ago
- RAG Logger is an open-source logging tool designed specifically for Retrieval-Augmented Generation (RAG) applications. It serves as a lig… ☆222 · Updated 4 months ago
- Parse PDFs into markdown using Vision LLMs ☆345 · Updated 2 months ago
- Animating R1's thoughts. ☆377 · Updated 2 months ago
- Fully neural approach for text chunking ☆319 · Updated last week
- Web scraper made with AI and simplicity in mind. It runs as a CLI that can be parallelized and outputs high-quality markdown content. ☆515 · Updated 2 months ago
- Omni SenseVoice: High-Speed Speech Recognition with word timestamps ☆833 · Updated last month
- A text extraction library supporting PDFs, images, office documents and more ☆1,784 · Updated 2 weeks ago
- VSCode extension that demonstrates the use of large language models (LLMs) for active debugging of programs ☆328 · Updated 2 months ago
- Examples and guides for using the VLM Run API ☆274 · Updated last month
- Fast and efficient unstructured data extraction. Written in Rust with bindings for many languages. ☆1,064 · Updated 4 months ago
- SOTA Open-Source Browser Agent for autonomously performing complex tasks on the web ☆1,548 · Updated this week
- Your toolkit for autonomous, evolving agent ecosystems. Create, execute, govern, and evolve agents that learn from experience, collaborat… ☆422 · Updated last week
- MCP server for fetching web page content using Playwright headless browser. ☆607 · Updated 2 weeks ago
- OCR Benchmark ☆464 · Updated last week
- A hub for various industry-specific schemas to be used with VLMs. ☆498 · Updated 2 weeks ago
- Detect and extract tables to markdown and csv ☆742 · Updated 3 months ago
- A self-hosted API that takes a URL and returns a file with browser screenshots. ☆959 · Updated last month
- Visualise your CSV files in seconds without sending your data anywhere ☆505 · Updated last month
- Integrate LLM in any pipeline - fit/predict pattern, JSON driven flows, and built in concurrency support. ☆586 · Updated last month
- Fully open-source command-line AI assistant inspired by OpenAI Codex, supporting local language models. ☆292 · Updated this week
- Open source multi-modal RAG for building AI apps over private knowledge. ☆1,667 · Updated this week
- Open-source unstructured data (PDFs, Images, Audiofiles) processing platform built for knowledge workers ☆275 · Updated last month
- OpenAI DeepResearch alternative: an AI-driven research system that performs comprehensive, iterative research on any topic using multiple… ☆561 · Updated last month
- With one command, create a natural-sounding audiobook from a variety of input formats (epub, mobi, txt, PDF, HTML and more!) ☆646 · Updated last month