explosion / spacy-layout
Process PDFs, Word documents and more with spaCy
★412 · Updated last month
Alternatives and similar repositories for spacy-layout:
Users who are interested in spacy-layout are comparing it to the libraries listed below.
- Fast Semantic Text Deduplication · ★525 · Updated this week
- Retrieve, Read and LinK: Fast and Accurate Entity Linking and Relation Extraction on an Academic Budget (ACL 2024) · ★385 · Updated 4 months ago
- Generalist and Lightweight Model for Relation Extraction (Extract any relationship types from text) · ★184 · Updated last month
- 🦦 weasel: A small and easy workflow system · ★75 · Updated 7 months ago
- SpanMarker for Named Entity Recognition · ★417 · Updated last month
- A spaCy wrapper for GliNER · ★107 · Updated 3 weeks ago
- A fast, lightweight and easy-to-use Python library for splitting text into semantically meaningful chunks. · ★244 · Updated this week
- Late Interaction Models Training & Retrieval · ★241 · Updated last week
- Efficiently find the best-suited language model (LM) for your NLP task · ★116 · Updated this week
- Fast State-of-the-Art Static Embeddings · ★1,060 · Updated this week
- A lightweight, low-dependency, unified API to use all common reranking and cross-encoder models. · ★1,295 · Updated last week
- A very simple news crawler with a funny name · ★333 · Updated this week
- Benchmark various LLM Structured Output frameworks: Instructor, Mirascope, Langchain, LlamaIndex, Fructose, Marvin, Outlines, etc on task… · ★147 · Updated 4 months ago
- RAG (Retrieval-Augmented Generation) Chatbot Examples Using PyMuPDF · ★763 · Updated last week
- Extract structured text from PDFs quickly · ★418 · Updated this week
- Running Docling as an API service · ★98 · Updated this week
- This package, developed as part of our research detailed in the Chroma Technical Report, provides tools for text chunking and evaluation.… · ★231 · Updated 4 months ago
- Recipes for learning, fine-tuning, and adapting ColPali to your multimodal RAG use cases. 👨🏻‍🍳 · ★253 · Updated 2 months ago
- This repository contains an easy and intuitive approach to few-shot classification using sentence-transformers or spaCy models, or zero-s… · ★212 · Updated last month
- Use late-interaction multi-modal models such as ColPali in just a few lines of code. · ★732 · Updated 3 weeks ago
- Fast lexical search implementing BM25 in Python using Numpy, Numba and Scipy · ★1,019 · Updated last month
- ★173 · Updated this week
- 🦙 Integrating LLMs into structured NLP pipelines · ★1,193 · Updated last month
- Gain clues from clustering! · ★312 · Updated 7 months ago
- ★207 · Updated 2 months ago
- A Lightweight Library for AI Observability · ★233 · Updated this week
- FastFit ⚡ When LLMs are Unfit Use FastFit ⚡ Fast and Effective Text Classification with Many Classes · ★185 · Updated 4 months ago
- Visualize Different Text Splitting Methods · ★223 · Updated last month
- In-Context Learning for eXtreme Multi-Label Classification (XMC) using only a handful of examples. · ★406 · Updated last year
- Lite & Super-fast re-ranking for your search & retrieval pipelines. Supports SoTA Listwise and Pairwise reranking based on LLMs and cro… · ★746 · Updated 2 months ago