neuml / txtmarker
Highlight text in documents
☆73 · Updated 11 months ago
Related projects:
- Efficient few-shot learning with cross-encoders. ☆35 · Updated 7 months ago
- Generate a SQLite database from Wikipedia & Wikidata dumps. ☆30 · Updated 5 months ago
- Repository for deepdoctection tutorial notebooks ☆36 · Updated last month
- Versatile framework designed to streamline the integration of your models, as well as those sourced from Hugging Face, into complex progr… ☆21 · Updated last month
- 💥 Use Hugging Face text and token classification pipelines directly in spaCy ☆61 · Updated 6 months ago
- SpaCyEx allows the creation of spaCy Matcher patterns with RegEx like syntax. ☆57 · Updated 4 months ago
- 💫 SpaCy wrapper for ConceptNet 💫 ☆88 · Updated last year
- Search through Facebook Research's PyTorch BigGraph Wikidata-dataset with the Weaviate vector search engine ☆31 · Updated 2 years ago
- Source code and data for Like a Good Nearest Neighbor ☆28 · Updated 7 months ago
- Summary Explorer is a tool to visually explore the state-of-the-art in text summarization. ☆43 · Updated 4 months ago
- Run OCR, extract information from documents and classify them. In addition, annotate documents and build custom NLP and computer vision m… ☆60 · Updated this week
- Logical structure analysis for visually structured documents ☆80 · Updated 2 years ago
- Legal document similarity - Code, data, and models for the ICAIL 2021 paper "Evaluating Document Representations for Content-based Legal … ☆31 · Updated 3 years ago
- A Python pipeline tool and plugin ecosystem for processing technical documents. Process papers from arXiv, SemanticScholar, PDF, with GRO… ☆41 · Updated last month
- arXiv plain text extraction ☆41 · Updated last year
- Topic Inference with Zeroshot models ☆61 · Updated last year
- Streamlit Named Entity Recognition (NER) annotation custom component ☆38 · Updated last year
- Tokenization across languages. Useful as preprocessing for subword tokenization. ☆21 · Updated last year
- Seahorse is a dataset for multilingual, multi-faceted summarization evaluation. It consists of 96K summaries with human ratings along 6 q… ☆84 · Updated 6 months ago
- 🤗 Disaggregators: Curated data labelers for in-depth analysis. ☆66 · Updated last year
- Alternate Implementation for Zero Shot Text Classification: Instead of reframing NLI/XNLI, this reframes the text backbone of CLIP models… ☆37 · Updated 2 years ago
- Documentation effort for the BookCorpus dataset ☆30 · Updated 3 years ago
- RaKUn 2.0 - A fast keyword detection algorithm ☆61 · Updated last month
- Python API for https://vespa.ai, the open big data serving engine ☆89 · Updated this week
- A file utility for accessing both local and remote files through a unified interface. ☆36 · Updated last month
- ChatBot App built using LangChain and Lightning AI ☆17 · Updated last year
- Completion After Prompt Probability. Make your LLM make a choice ☆68 · Updated last week
- A python library for extracting text from PDFs without losing the formatting of the PDF content. ☆72 · Updated 2 years ago
- Granular Viewer of Sentiments Between Entities in Massively Large Documents and Collections of Texts, powered by AREkit ☆36 · Updated last month