DS4SD / deepsearch-toolkit
Interact with the Deep Search platform for new knowledge explorations and discoveries
☆133 Updated 3 weeks ago
Related projects
Alternatives and complementary repositories for deepsearch-toolkit
- Examples using the Deep Search functionalities ☆44 Updated 3 months ago
- ☆34 Updated last week
- Build document-native LLM applications ☆50 Updated 2 months ago
- Generalist and Lightweight Model for Relation Extraction (Extract any relationship types from text) ☆93 Updated this week
- A spaCy wrapper for GliNER ☆87 Updated 3 months ago
- A python library to define and validate data types in Docling. ☆28 Updated this week
- Let's build better datasets, together! ☆202 Updated 3 months ago
- multimodal document analysis ☆160 Updated 5 months ago
- Create fast graph language models from converted PDF documents for knowledge extraction and Q&A. ☆21 Updated 2 weeks ago
- A Python library aimed at dissecting and augmenting NER training data. ☆56 Updated last year
- Retrieve, Read and LinK: Fast and Accurate Entity Linking and Relation Extraction on an Academic Budget (ACL 2024) ☆326 Updated last month
- ☆82 Updated 5 months ago
- Generalist and Lightweight Model for Text Classification ☆48 Updated 2 months ago
- 🦄 Unitxt: a python library for getting data fired up and set for training and evaluation ☆159 Updated this week
- FastFit ⚡ When LLMs are Unfit Use FastFit ⚡ Fast and Effective Text Classification with Many Classes ☆182 Updated last month
- End-to-end zero-shot entity and relation extraction ☆56 Updated 3 months ago
- Scientific Document Insight Q/A ☆23 Updated 2 months ago
- GLiNER model in a FastAPI microservice. ☆28 Updated 2 weeks ago
- 💫 SpaCy wrapper for ConceptNet 💫 ☆88 Updated last year
- SpanMarker for Named Entity Recognition ☆398 Updated 3 months ago
- Streamlit Annotation Tools is a Streamlit component that gives you access to various annotation tools (labeling, highlighting, etc.) for … ☆77 Updated 10 months ago
- Viewer for the structure extracted by Grobid on PDF documents ☆38 Updated this week
- Python API for https://vespa.ai, the open big data serving engine ☆101 Updated this week
- A collection of datasets for language model pretraining including scripts for downloading, preprocessing, and sampling. ☆52 Updated 3 months ago
- SciRepEval benchmark training and evaluation scripts ☆67 Updated 5 months ago
- Parsers for scientific papers (PDF2JSON, TEX2JSON, JATS2JSON) ☆348 Updated 7 months ago
- This repository contains an easy and intuitive approach to use SetFit in combination with spaCy. ☆72 Updated last year
- Open source project for data preparation of LLM application builders ☆270 Updated this week
- A spaCy custom component that extracts and normalizes temporal expressions ☆52 Updated last year
- Tools to scrape publication metadata from pubmed, arxiv, medrxiv and chemrxiv. ☆239 Updated this week