explosion / spacy-layout
Process PDFs, Word documents and more with spaCy
☆480 Updated 2 weeks ago
Alternatives and similar repositories for spacy-layout:
Users who are interested in spacy-layout are comparing it to the libraries listed below.
- Fast State-of-the-Art Static Embeddings ☆1,109 Updated 3 weeks ago
- Fast Semantic Text Deduplication ☆582 Updated 3 weeks ago
- Retrieve, Read and LinK: Fast and Accurate Entity Linking and Relation Extraction on an Academic Budget (ACL 2024) ☆400 Updated 5 months ago
- A fast, lightweight and easy-to-use Python library for splitting text into semantically meaningful chunks. ☆265 Updated this week
- Generalist and Lightweight Model for Relation Extraction (Extract any relationship types from text) ☆196 Updated this week
- Use late-interaction multi-modal models such as ColPali in just a few lines of code. ☆753 Updated last month
- Generic rag framework to apply the power of LLMs on any given dataset ☆572 Updated this week
- Efficiently find the best-suited language model (LM) for your NLP task ☆119 Updated this week
- Lite & Super-fast re-ranking for your search & retrieval pipelines. Supports SoTA Listwise and Pairwise reranking based on LLMs and cro… ☆770 Updated 3 months ago
- Fast lexical search implementing BM25 in Python using Numpy, Numba and Scipy ☆1,063 Updated last week
- weasel: A small and easy workflow system ☆76 Updated 8 months ago
- SpanMarker for Named Entity Recognition ☆422 Updated 2 months ago
- A lightweight, low-dependency, unified API to use all common reranking and cross-encoder models. ☆1,338 Updated last month
- This package, developed as part of our research detailed in the Chroma Technical Report, provides tools for text chunking and evaluation.… ☆270 Updated last week
- A very simple news crawler with a funny name ☆358 Updated last week
- Generalist and Lightweight Model for Named Entity Recognition (Extract any entity types from texts) @ NAACL 2024 ☆1,876 Updated this week
- Integrating LLMs into structured NLP pipelines ☆1,213 Updated 2 months ago
- ☆214 Updated 3 months ago
- A collection of example notebooks ☆444 Updated last week
- A spaCy wrapper for GliNER ☆108 Updated last month
- This repository shares end-to-end notebooks on how to use various Weaviate features and integrations! ☆724 Updated this week
- Recipes for learning, fine-tuning, and adapting ColPali to your multimodal RAG use cases. ☆261 Updated 3 months ago
- The code used to train and run inference with the ColVision models, e.g. ColPali, ColQwen2, and ColSmol. ☆1,614 Updated this week
- RAG (Retrieval-Augmented Generation) Chatbot Examples Using PyMuPDF ☆830 Updated this week
- A system for agentic LLM-powered data processing and ETL ☆1,718 Updated this week
- Structured information extraction from documents ☆312 Updated 5 months ago
- Developer APIs to Accelerate LLM Projects ☆1,615 Updated 5 months ago
- Running Docling as an API service ☆174 Updated this week