explosion / spacy-layout
Process PDFs, Word documents and more with spaCy
☆579 Updated 2 months ago
Alternatives and similar repositories for spacy-layout
Users that are interested in spacy-layout are comparing it to the libraries listed below
- Generalist and Lightweight Model for Relation Extraction (Extract any relationship types from text) ☆207 Updated last month
- Retrieve, Read and LinK: Fast and Accurate Entity Linking and Relation Extraction on an Academic Budget (ACL 2024) ☆418 Updated 7 months ago
- SpanMarker for Named Entity Recognition ☆429 Updated 4 months ago
- Extract structured text from PDFs quickly ☆474 Updated 2 months ago
- 🦦 weasel: A small and easy workflow system ☆83 Updated 10 months ago
- Use late-interaction multi-modal models such as ColPali in just a few lines of code. ☆777 Updated 3 months ago
- A spaCy wrapper for GliNER ☆114 Updated 3 months ago
- Generalist and Lightweight Model for Named Entity Recognition (Extract any entity types from texts) @ NAACL 2024 ☆2,001 Updated this week
- ☆222 Updated 5 months ago
- Fast Semantic Text Deduplication & Filtering ☆659 Updated 2 weeks ago
- A fast, lightweight and easy-to-use Python library for splitting text into semantically meaningful chunks. ☆300 Updated last month
- Simple package to extract text with coordinates from programmatic PDFs ☆121 Updated last month
- A lightweight, low-dependency, unified API to use all common reranking and cross-encoder models. ☆1,408 Updated this week
- A Python client for the Unstructured Platform API ☆101 Updated this week
- Python bindings to PDFium ☆568 Updated this week
- 🦙 Integrating LLMs into structured NLP pipelines ☆1,241 Updated 4 months ago
- Lite & Super-fast re-ranking for your search & retrieval pipelines. Supports SoTA Listwise and Pairwise reranking based on LLMs and cro… ☆792 Updated 5 months ago
- ☆122 Updated 2 months ago
- A Python library to define and validate data types in Docling. ☆131 Updated this week
- Fast lexical search implementing BM25 in Python using NumPy, Numba and SciPy ☆1,147 Updated 3 weeks ago
- Efficiently find the best-suited language model (LM) for your NLP task ☆122 Updated last week
- Generalist and Lightweight Model for Text Classification ☆124 Updated last week
- LLM abstractions that aren't obstructions ☆1,095 Updated this week
- This repository contains an easy and intuitive approach to few-shot classification using sentence-transformers or spaCy models, or zero-s… ☆214 Updated 3 months ago
- A very simple news crawler with a funny name ☆379 Updated this week
- Recipes for learning, fine-tuning, and adapting ColPali to your multimodal RAG use cases. 👨🏻‍🍳 ☆282 Updated 2 weeks ago
- Robust and fast topic models with sentence-transformers. ☆48 Updated this week
- ☆180 Updated 3 weeks ago
- Structured information extraction from documents ☆314 Updated 7 months ago
- Fast, Accurate, Lightweight Python library to make State of the Art Embedding ☆2,044 Updated this week