explosion / spacy-layout
Process PDFs, Word documents and more with spaCy
☆615 · Updated 2 months ago
Alternatives and similar repositories for spacy-layout
Users who are interested in spacy-layout are comparing it to the libraries listed below.
- Retrieve, Read and LinK: Fast and Accurate Entity Linking and Relation Extraction on an Academic Budget (ACL 2024) ☆424 · Updated 8 months ago
- Generalist and Lightweight Model for Relation Extraction (Extract any relationship types from text) ☆212 · Updated last month
- Generalist and Lightweight Model for Named Entity Recognition (Extract any entity types from texts) @ NAACL 2024 ☆2,053 · Updated last week
- SpanMarker for Named Entity Recognition ☆431 · Updated 4 months ago
- Integrating LLMs into structured NLP pipelines ☆1,254 · Updated 4 months ago
- Fast Semantic Text Deduplication & Filtering ☆697 · Updated last week
- Simple package to extract text with coordinates from programmatic PDFs ☆126 · Updated last month
- A spaCy wrapper for GliNER ☆115 · Updated 4 months ago
- A Python library to define and validate data types in Docling. ☆137 · Updated last week
- Fast lexical search implementing BM25 in Python using Numpy, Numba and Scipy ☆1,175 · Updated last week
- Fast State-of-the-Art Static Embeddings ☆1,706 · Updated this week
- A lightweight, low-dependency, unified API to use all common reranking and cross-encoder models. ☆1,436 · Updated last week
- A fast, lightweight and easy-to-use Python library for splitting text into semantically meaningful chunks. ☆315 · Updated 2 months ago
- Extract structured text from PDFs quickly ☆481 · Updated this week
- Use late-interaction multi-modal models such as ColPali in just a few lines of code. ☆789 · Updated 4 months ago
- Late Interaction Models Training & Retrieval ☆395 · Updated last week
- This repository shares end-to-end notebooks on how to use various Weaviate features and integrations! ☆770 · Updated last week
- This package, developed as part of our research detailed in the Chroma Technical Report, provides tools for text chunking and evaluation.… ☆323 · Updated 2 months ago
- Code for explaining and evaluating late chunking (chunked pooling) ☆390 · Updated 5 months ago
- Easily deploy Haystack pipelines as REST APIs and MCP Tools. ☆83 · Updated last week
- This repository contains an easy and intuitive approach to few-shot classification using sentence-transformers or spaCy models, or zero-s… ☆214 · Updated 4 months ago
- ☆183 · Updated this week
- Lite & Super-fast re-ranking for your search & retrieval pipelines. Supports SoTA Listwise and Pairwise reranking based on LLMs and cro… ☆809 · Updated 6 months ago
- ☆225 · Updated 5 months ago
- RAG (Retrieval-Augmented Generation) Chatbot Examples Using PyMuPDF ☆937 · Updated 3 weeks ago
- ☆122 · Updated 3 months ago
- Colivara is a suite of services that allows you to store, search, and retrieve documents based on their visual embedding. ColiVara has st… ☆1,117 · Updated last month
- ☆122 · Updated this week
- ☆1,196 · Updated 11 months ago
- A Blazing-Fast Python Library for Ranking Evaluation, Comparison, and Fusion ☆559 · Updated 11 months ago