explosion/spacy-layout

📚 Process PDFs, Word documents and more with spaCy

☆909

Alternatives and similar repositories for spacy-layout

Users who are interested in spacy-layout are comparing it to the libraries listed below.

explosion / spacy-llm
🦙 Integrating LLMs into structured NLP pipelines
☆1,394 · Mar 27, 2026 · Updated 3 months ago
urchade / GLiNER
Generalist and Lightweight Model for Named Entity Recognition (Extract any entity types from texts)
☆3,428 · Updated this week
theirstory / gliner-spacy
A spaCy wrapper for GliNER
☆134 · Jan 29, 2025 · Updated last year
explosion / weasel
🦦 weasel: A small and easy workflow system
☆93 · Mar 27, 2026 · Updated 3 months ago
explosion / spacy-curated-transformers
spaCy entry points for Curated Transformers
☆32 · Mar 27, 2026 · Updated 3 months ago
docling-project / docling
Get your documents ready for gen AI
☆63,762 · Updated this week
jackboyla / GLiREL
Generalist and Lightweight Model for Relation Extraction (Extract any relationship types from text)
☆288 · Mar 30, 2026 · Updated 3 months ago
wjbmattingly / bagpipes-spacy
Bagpipes spaCy is a collection of custom spaCy pipeline components designed to enhance text processing capabilities.
☆22 · Aug 15, 2024 · Updated last year
wjbmattingly / spacyex
SpaCyEx allows the creation of spaCy Matcher patterns with RegEx like syntax.
☆59 · May 3, 2024 · Updated 2 years ago
explosion / spaCy
💫 Industrial-strength Natural Language Processing (NLP) in Python
☆33,773 · May 19, 2026 · Updated 2 months ago
docling-project / docling-haystack
Docling Haystack integration
☆29 · Apr 9, 2026 · Updated 3 months ago
explosion / spacy-vscode
spaCy extension for Visual Studio Code
☆31 · Mar 10, 2025 · Updated last year
explosion / spacy-experimental
🧪 Cutting-edge experimental spaCy components and features
☆104 · Apr 23, 2024 · Updated 2 years ago
wjbmattingly / date-spacy
☆22 · Aug 24, 2023 · Updated 2 years ago
tomaarsen / SpanMarkerNER
SpanMarker for Named Entity Recognition
☆477 · Apr 10, 2026 · Updated 3 months ago
explosion / prodigy-pdf
A Prodigy plugin for PDF annotation
☆37 · Aug 11, 2025 · Updated 11 months ago
explosion / confection
Confection: the sweetest config system for Python
☆194 · Mar 27, 2026 · Updated 3 months ago
Unstructured-IO / unstructured
Convert documents to structured data effortlessly. Unstructured is open-source ETL solution for transforming complex documents into clean…
☆15,197 · Updated this week
IBM / zshot
Zero and Few shot named entity & relationships recognition
☆400 · Sep 17, 2025 · Updated 10 months ago
tomaarsen / module_dependencies
Gather module dependencies of source code
☆13 · Jul 21, 2023 · Updated 3 years ago
MantisAI / sieves
Plug-and-play document AI with zero-shot models.
☆126 · May 11, 2026 · Updated 2 months ago
richardpaulhudson / coreferee
Coreference resolution for English, French, German and Polish, optimised for limited training data and easily extensible for further lang…
☆138 · Apr 23, 2024 · Updated 2 years ago
Lucaterre / spacyfishing
A spaCy wrapper of Entity-Fishing (component) for named entity disambiguation and linking on Wikidata
☆173 · Nov 7, 2022 · Updated 3 years ago
wjbmattingly / keyword-spacy
Keyword spaCy is a spaCy pipeline component for extracting keywords from text using cosine similarity.
☆14 · Dec 7, 2023 · Updated 2 years ago
wjbmattingly / biospacy
☆22 · Jan 2, 2023 · Updated 3 years ago
neuml / staticvectors
🔢 Work with static vector models
☆39 · Apr 21, 2025 · Updated last year
AnswerDotAI / RAGatouille
Easily use and train state of the art late-interaction retrieval methods (ColBERT) in any RAG pipeline. Designed for modularity and ease-…
☆3,943 · May 17, 2025 · Updated last year
ucbepic / docetl
A system for agentic LLM-powered data processing and ETL
☆3,947 · Updated this week
illuin-tech / colpali
The code used to train and run inference with the ColVision models, e.g. ColPali, ColQwen2, and ColSmol.
☆2,707 · Jul 13, 2026 · Updated last week
fastino-ai / GLiNER2
Unified Schema-Based Information Extraction
☆1,712 · Updated this week
allenai / scispacy
A full spaCy pipeline and models for scientific/biomedical documents.
☆1,977 · Dec 4, 2025 · Updated 7 months ago
MartinoMensio / spacy-universal-sentence-encoder
Google USE (Universal Sentence Encoder) for spaCy
☆182 · Mar 24, 2023 · Updated 3 years ago
koaning / embetter
just a bunch of useful embeddings for scikit-learn pipelines
☆527 · Feb 12, 2026 · Updated 5 months ago
docling-project / docling-parse
Simple package to extract text with coordinates from programmatic PDFs
☆326 · Updated this week
explosion / spacy-huggingface-pipelines
💥 Use Hugging Face text and token classification pipelines directly in spaCy
☆65 · Mar 18, 2024 · Updated 2 years ago
explosion / projects
🪐 End-to-end NLP workflows from prototype to production
☆1,432 · Oct 15, 2024 · Updated last year
lightonai / pylate
Late Interaction Models Training & Retrieval
☆876 · Updated this week
stanfordnlp / dspy
DSPy: The framework for programming—not prompting—language models
☆36,371 · Updated this week
datalab-to / marker
Convert PDF to markdown + JSON quickly with high accuracy
☆37,843 · Updated this week