pymupdf / pymupdf4llm
PyMuPDF4LLM
☆1,152 Updated last week
Alternatives and similar repositories for pymupdf4llm
Users who are interested in pymupdf4llm are comparing it to the libraries listed below.
- Developer APIs to Accelerate LLM Projects ☆1,741 Updated last year
- Knowledge Agents and Management in the Cloud ☆4,209 Updated last week
- High-performance retrieval engine for unstructured data ☆1,533 Updated 3 weeks ago
- ExtractThinker is a Document Intelligence library for LLMs, offering ORM-style interaction for flexible and powerful document workflows. ☆1,456 Updated 3 months ago
- The code used to train and run inference with the ColVision models, e.g. ColPali, ColQwen2, and ColSmol. ☆2,354 Updated last week
- Extract structured text from pdfs quickly ☆629 Updated 5 months ago
- The official implementation of RAPTOR: Recursive Abstractive Processing for Tree-Organized Retrieval ☆1,487 Updated last year
- Lightweight, performant, deep table extraction ☆517 Updated 3 months ago
- ☆1,400 Updated last year
- Parse PDFs into markdown using Vision LLMs ☆449 Updated 2 months ago
- Lite & Super-fast re-ranking for your search & retrieval pipelines. Supports SoTA Listwise and Pairwise reranking based on LLMs and cro… ☆892 Updated 2 months ago
- A lightweight, low-dependency, unified API to use all common reranking and cross-encoder models. ☆1,579 Updated 6 months ago
- Running Docling as an API service ☆998 Updated last week
- Simple package to extract text with coordinates from programmatic PDFs ☆218 Updated last month
- 📚 Process PDFs, Word documents and more with spaCy ☆816 Updated 8 months ago
- This repo provides the server side code for llmsherpa API to connect. It includes parsers for various file formats. ☆1,276 Updated 8 months ago
- Neo4j GraphRAG for Python ☆915 Updated last month
- A fast, lightweight and easy-to-use Python library for splitting text into semantically meaningful chunks. ☆486 Updated last month
- Generic rag framework to apply the power of LLMs on any given dataset ☆660 Updated 3 months ago
- Use late-interaction multi-modal models such as ColPali in just a few lines of code. ☆832 Updated 10 months ago
- Python bindings to PDFium, reasonably cross-platform. ☆684 Updated this week
- ☆846 Updated last week
- DocLayout-YOLO: Enhancing Document Layout Analysis through Diverse Synthetic Data and Global-to-Local Adaptive Perception ☆1,802 Updated 7 months ago
- RAG that intelligently adapts to your use case, data, and queries ☆3,611 Updated last month
- Code for explaining and evaluating late chunking (chunked pooling) ☆469 Updated 11 months ago
- [CVPR 2025] A Comprehensive Benchmark for Document Parsing and Evaluation ☆1,237 Updated this week
- This package, developed as part of our research detailed in the Chroma Technical Report, provides tools for text chunking and evaluation.… ☆454 Updated 8 months ago
- A Docker-powered service for PDF document layout analysis. This service provides a powerful and flexible PDF analysis service. The servic… ☆749 Updated last week
- RAGChecker: A Fine-grained Framework For Diagnosing RAG ☆1,024 Updated 11 months ago
- Chat with multiple PDFs locally ☆605 Updated last month