Parse PDFs into markdown using Vision LLMs
☆472 · Oct 4, 2025 · Updated 7 months ago
Alternatives and similar repositories for vision-parse
Users that are interested in vision-parse are comparing it to the libraries listed below.
- PDF intelligence platform combining IBM Docling for document processing, LlamaIndex for data structuring, and Streamlit for a powerful UI… ☆52 · Dec 30, 2024 · Updated last year
- ExtractThinker is a Document Intelligence library for LLMs, offering ORM-style interaction for flexible and powerful document workflows. ☆1,535 · Aug 27, 2025 · Updated 8 months ago
- Convert PDF to markdown + JSON quickly with high accuracy ☆34,606 · Apr 24, 2026 · Updated last week
- OCR & Document Extraction using vision models ☆12,227 · May 20, 2025 · Updated 11 months ago
- NeMo Retriever Library is a scalable, performance-oriented document content and metadata extraction microservice. NeMo Retriever extracti… ☆2,915 · Updated this week
- ☆11 · Oct 24, 2024 · Updated last year
- Document (PDF, Word, PPTX ...) extraction and parse API using state of the art modern OCRs + Ollama supported models. Anonymize documents… ☆3,097 · Dec 8, 2025 · Updated 4 months ago
- Improved file parsing for LLM’s ☆3,155 · Nov 13, 2024 · Updated last year
- Turn local files into a prompt for an LLM ☆176 · Jan 19, 2025 · Updated last year
- Open source alternative to Gemini Deep Research. Generate reports with AI based on search results. ☆2,132 · Dec 15, 2025 · Updated 4 months ago
- RAG Logger is an open-source logging tool designed specifically for Retrieval-Augmented Generation (RAG) applications. It serves as a lig… ☆227 · Dec 24, 2024 · Updated last year
- File Parser optimised for LLM Ingestion with no loss 🧠 Parse PDFs, Docx, PPTx in a format that is ideal for LLMs. ☆7,354 · Feb 21, 2025 · Updated last year
- Multi-agent that helps you organize and write documents. ☆351 · Nov 15, 2024 · Updated last year
- A Comprehensive Toolkit for High-Quality PDF Content Extraction ☆9,626 · Jan 3, 2025 · Updated last year
- E2M converts various file types (doc, docx, epub, html, htm, url, pdf, ppt, pptx, mp3, m4a) into Markdown. It’s easy to install, with ded… ☆1,292 · Sep 8, 2024 · Updated last year
- MilimoChat: Privacy-first, self-hosted AI chat with customizable personas, context-aware memory, and local analytics. Built on Python/Str… ☆14 · Mar 12, 2025 · Updated last year
- Get your documents ready for gen AI ☆59,087 · Updated this week
- AI reads books: Page-by-Page PDF Knowledge Extractor & Summarizer. script performs an intelligent page-by-page analysis of PDF books, met… ☆2,103 · Jan 20, 2025 · Updated last year
- ☆49 · Sep 11, 2025 · Updated 7 months ago
- Toolkit for linearizing PDFs for LLM datasets/training ☆17,231 · Mar 25, 2026 · Updated last month
- SoTA production-ready AI retrieval system. Agentic Retrieval-Augmented Generation (RAG) with a RESTful API. ☆7,795 · Nov 7, 2025 · Updated 5 months ago
- Knowledge Agents and Management in the Cloud ☆4,250 · Apr 27, 2026 · Updated last week
- GraphRAG using Local LLMs - Features robust API and multiple apps for Indexing/Prompt Tuning/Query/Chat/Visualizing/Etc. This is meant to… ☆2,325 · Nov 9, 2024 · Updated last year
- AutoRAG: An Open-Source Framework for Retrieval-Augmented Generation (RAG) Evaluation & Optimization with AutoML-Style Automation ☆4,745 · Apr 26, 2026 · Updated last week
- Deep research agent to help you find the best GitHub repositories 🕵️! ☆867 · Apr 8, 2026 · Updated 3 weeks ago
- FlexRAG: A RAG Framework for Information Retrieval and Generation. ☆236 · Apr 28, 2026 · Updated last week
- OCR, layout analysis, reading order, table recognition in 90+ languages ☆19,680 · Apr 24, 2026 · Updated last week
- LLM-Driven Extraction of Unstructured Data — Built for API Deployments & ETL Pipeline Workflows ☆6,557 · Updated this week
- CLI that uses DSPy to interact with MCP servers. ☆24 · Mar 10, 2025 · Updated last year
- PDF craft can convert PDF files into various other formats. This project will focus on processing PDF files of scanned books. ☆5,417 · Updated this week
- Vision-Augmented Retrieval and Generation (VARAG) - Vision first RAG Engine ☆500 · Jul 23, 2025 · Updated 9 months ago
- Very minimal (and stateless) agent framework ☆44 · Jan 12, 2025 · Updated last year
- RAG that intelligently adapts to your use case, data, and queries ☆3,777 · Nov 1, 2025 · Updated 6 months ago
- Transforms complex documents like PDFs and Office docs into LLM-ready markdown/JSON for your Agentic workflows. ☆61,724 · Updated this week
- [AAAI 2025] StoryWeaver: A Unified World Model for Knowledge-Enhanced Story Character Customization ☆226 · Jan 13, 2026 · Updated 3 months ago
- Chrome / Edge extension to turn arXiv papers into Markdown codes in one click. ☆93 · Mar 20, 2025 · Updated last year
- Fast and efficient unstructured data extraction. Written in Rust with bindings for many languages. ☆1,747 · Dec 21, 2024 · Updated last year
- Structured data extraction and instruction calling with ML, LLM and Vision LLM ☆5,153 · Updated this week
- Using GPT to parse PDF ☆3,551 · Apr 17, 2025 · Updated last year