DS4SD / docling-core
A Python library to define and validate data types in Docling.
☆28 Updated this week
Related projects
Alternatives and complementary repositories for docling-core
- ☆34 Updated last week
- Create fast graph language models from converted PDF documents for knowledge extraction and Q&A. ☆21 Updated 3 weeks ago
- Examples using the Deep Search functionalities ☆44 Updated 3 months ago
- Simple package to extract text with coordinates from programmatic PDFs ☆21 Updated last week
- Build document-native LLM applications ☆50 Updated 2 months ago
- Interact with the Deep Search platform for new knowledge explorations and discoveries ☆133 Updated 3 weeks ago
- Running Docling as an API service ☆13 Updated last month
- DocLayNet: A Large Human-Annotated Dataset for Document-Layout Analysis ☆266 Updated last year
- A Benchmark of PDF Information Extraction Tools using a Multi-Task and Multi-Domain Evaluation Framework for Academic Documents ☆19 Updated last year
- A Docker-powered service for PDF document layout analysis. This service provides a powerful and flexible PDF analysis service. The servic… ☆171 Updated last week
- GLiNER model in a FastAPI microservice. ☆28 Updated 2 weeks ago
- DocLLM: A layout-aware generative language model for multimodal document understanding ☆112 Updated 10 months ago
- ☆47 Updated last month
- Logical structure analysis for visually structured documents ☆82 Updated 2 years ago
- YOLO models trained on DocLayNet: power your document intelligence with layout analysis ☆61 Updated last month
- ☆105 Updated last month
- End-to-end zero-shot entity and relation extraction ☆56 Updated 3 months ago
- Doc2Graph transforms documents into graphs and exploits a GNN to solve several tasks. ☆115 Updated last year
- High-level library for batched embeddings generation, blazingly fast web-based RAG and quantized indexes processing ⚡ ☆59 Updated last week
- A spaCy wrapper for GLiNER ☆87 Updated 3 months ago
- LitePali is a minimal, efficient implementation of ColPali for image retrieval and indexing, optimized for cloud deployment. ☆18 Updated last month
- Run OCR, extract information from documents and classify them. In addition, annotate documents and build custom NLP and computer vision m… ☆62 Updated this week
- LLM-driven automated knowledge graph construction from text using DSPy and Neo4j. ☆153 Updated 7 months ago
- How to construct knowledge graphs from unstructured data sources ☆85 Updated last month
- Python package that adds IntelligentGraph capabilities to RDFLib RDF graph package ☆53 Updated 10 months ago
- Viewer for the structure extracted by Grobid on PDF documents ☆38 Updated last week
- A Python library to chunk/group your texts based on semantic similarity. ☆85 Updated 4 months ago
- Code, datasets, and checkpoints for the paper "CRAFT Your Dataset: Task-Specific Synthetic Dataset Generation Through Corpus Retrieval an… ☆25 Updated last month
- Integrate AI-powered Document Analysis Pipelines ☆61 Updated last week
- A Python pipeline tool and plugin ecosystem for processing technical documents. Process papers from arXiv, SemanticScholar, PDF, with GRO… ☆43 Updated 3 months ago