DS4SD / deepsearch-toolkit
Interact with the Deep Search platform for new knowledge explorations and discoveries
☆181 Updated 2 months ago
Alternatives and similar repositories for deepsearch-toolkit:
Users who are interested in deepsearch-toolkit are comparing it to the libraries listed below.
- Examples using the Deep Search functionalities ☆69 Updated last month
- Create fast graph language models from converted PDF documents for knowledge extraction and Q&A. ☆48 Updated last month
- Build document-native LLM applications ☆52 Updated 6 months ago
- ☆87 Updated last week
- Simple package to extract text with coordinates from programmatic PDFs ☆83 Updated last week
- A Python library to define and validate data types in Docling. ☆92 Updated this week
- Generalist and Lightweight Model for Text Classification ☆92 Updated this week
- Generalist and Lightweight Model for Relation Extraction (Extract any relationship types from text) ☆196 Updated this week
- Running Docling as an API service ☆174 Updated this week
- 🦄 Unitxt: a python library for getting data fired up and set for training and evaluation ☆181 Updated this week
- DocLLM: A layout-aware generative language model for multimodal document understanding ☆123 Updated last year
- Python package to parse PDFs with different parsers ☆35 Updated 3 months ago
- A spaCy wrapper for GLiNER ☆108 Updated last month
- DocLayNet: A Large Human-Annotated Dataset for Document-Layout Analysis ☆327 Updated 2 years ago
- Repository for ACL paper: "Statements: Universal Information Extraction from Tables with Large Language Models for ESG KPIs" ☆13 Updated 8 months ago
- Retrieve, Read and LinK: Fast and Accurate Entity Linking and Relation Extraction on an Academic Budget (ACL 2024) ☆400 Updated 5 months ago
- Extract tables from PDFs using LLMWhisperer and extract structured information from those tables using Langchain ☆35 Updated 5 months ago
- Incorporating VIsual LAyout Structures for Scientific Text Classification ☆175 Updated 2 years ago
- MedEmbed is a collection of embedding models fine-tuned specifically for medical and clinical data. ☆44 Updated 5 months ago
- FastFit ⚡ When LLMs are Unfit Use FastFit ⚡ Fast and Effective Text Classification with Many Classes ☆190 Updated 5 months ago
- ☆176 Updated last week
- ☆118 Updated 3 weeks ago
- Chunk your text using gpt4o-mini more accurately ☆44 Updated 7 months ago
- Build Enterprise RAG (Retrieval Augmented Generation) Pipelines to tackle various Generative AI use cases with LLMs by simply plugging co… ☆109 Updated 8 months ago
- Benchmark various LLM Structured Output frameworks: Instructor, Mirascope, Langchain, LlamaIndex, Fructose, Marvin, Outlines, etc. on task… ☆158 Updated 6 months ago
- Vision Document Retrieval (ViDoRe): Benchmark. Evaluation code for the ColPali paper. ☆183 Updated last week
- multimodal document analysis ☆164 Updated 9 months ago
- Ready-to-go containerized RAG service. Implemented with text-embedding-inference + Qdrant/LanceDB. ☆61 Updated 3 months ago
- General solution to archetype LLM batch use case ☆34 Updated last year
- Scientific Document Insight Q/A ☆29 Updated last month