lfoppiano / document-qa
Scientific Document Insight Q/A
☆32Updated 3 months ago
Alternatives and similar repositories for document-qa
Users that are interested in document-qa are comparing it to the libraries listed below
- Viewer for the structure extracted by Grobid on PDF documents☆57Updated last month
- ☆67Updated last year
- A Python pipeline tool and plugin ecosystem for processing technical documents. Process papers from arXiv, SemanticScholar, PDF, with GRO…☆52Updated 8 months ago
- Open Access PDF harvester, metadata aggregator and full-text ingester☆62Updated last year
- This repository contains an easy and intuitive approach to use SetFit in combination with spaCy.☆80Updated 2 years ago
- Efficient few-shot learning with cross-encoders.☆60Updated last year
- Using open source LLMs to build synthetic datasets for direct preference optimization☆71Updated last year
- ☆23Updated 2 years ago
- ☆28Updated last year
- PDF parser powered by grobid☆27Updated last year
- GraphER: A Structure-aware Text-to-Graph Model for Entity and Relation Extraction☆82Updated last year
- 🗺️ Data Cleaning and Textual Data Visualization 🗺️☆193Updated 6 months ago
- GLiNER model in a FastAPI microservice.☆47Updated last year
- Streamlit Annotation Tools is a Streamlit component that gives you access to various annotation tools (labeling, highlighting, etc.) for …☆99Updated last year
- A spaCy wrapper for GLiNER☆125Updated 10 months ago
- A collection of datasets for language model pretraining including scripts for downloading, preprocessing, and sampling.☆63Updated last year
- Notebooks for training universal 0-shot classifiers on many different tasks☆137Updated 11 months ago
- A handy PDF-to-JSON conversion tool for academic papers implemented in Python.☆71Updated 2 years ago
- A Python library for the Semantic Scholar (S2) API with typed pydantic objects and various nifty functionalities.☆22Updated 4 years ago
- ☆55Updated last year
- HDBSCAN Tuning for BERTopic Models☆49Updated 2 years ago
- Chunk your text more accurately using gpt4o-mini☆44Updated last year
- SpaCyEx allows the creation of spaCy Matcher patterns with RegEx-like syntax.☆59Updated last year
- Generalist and Lightweight Model for Text Classification☆166Updated last week
- multimodal document analysis☆166Updated last month
- Small Python package to measure OCR quality and other related metrics.☆25Updated last year
- Examples using the Deep Search functionalities☆85Updated 10 months ago
- Ready-to-go containerized RAG service. Implemented with text-embedding-inference + Qdrant/LanceDB.☆73Updated 11 months ago
- [EMNLP 2023 Demo] fabricator - annotating and generating datasets with large language models.☆111Updated last year
- This repository is part of an NLP course for humanities and cultural studies. This course uses historical newspapers as a source and appl…☆18Updated 6 months ago