alea-institute / nupunkt
Next-generation Punkt sentence boundary detection with zero dependencies
☆17Updated last month
Alternatives and similar repositories for nupunkt
Users that are interested in nupunkt are comparing it to the libraries listed below
- SpaCyEx allows the creation of spaCy Matcher patterns with RegEx like syntax.☆59Updated last year
- Tool to apply Legal Matter Specification Standard (LMSS) to documents☆12Updated last year
- 💥 Use Hugging Face text and token classification pipelines directly in spaCy☆63Updated last year
- A simple library for segmenting legal texts☆17Updated 2 years ago
- Fastlaw's purpose is to replace generic word embeddings for work on supervised machine learning NLP-tasks with legal texts.☆39Updated 6 years ago
- Small python package to measure OCR quality and other related metrics.☆25Updated last year
- ☆55Updated last year
- 🌸 Train floret vectors☆18Updated 2 years ago
- spaCy entry points for Curated Transformers☆32Updated 4 months ago
- Named entity recognition for the legal domain☆42Updated 4 years ago
- Tools for interactive visual exploration of semantic embeddings.☆38Updated last year
- ☆69Updated 3 years ago
- 🍏 Make Thinc faster on macOS by calling into Apple's native Accelerate library☆101Updated 3 months ago
- Mining Legal Arguments in Court Decisions - Data and software☆69Updated 2 years ago
- 📜 Dehyphenation of broken text (mainly German), i.e., extracted from a PDF☆39Updated 3 years ago
- Language detection using spaCy and fastText☆57Updated last year
- API client for fetching and comparing passages from legislation☆14Updated 8 months ago
- Enhanced version of WikiExtractor: a Wikipedia dumps extractor☆21Updated 2 weeks ago
- This is a prototype of a multi-lingual suite for named-entity recognition in Python.☆21Updated last year
- Generate reports for spaCy models.☆29Updated 3 years ago
- Python based Wikidata framework for easy dataframe extraction☆45Updated last year
- Finds linguistic patterns effortlessly☆38Updated 2 years ago
- A spaCy wrapper of Entity-Fishing (component) for named entity disambiguation and linking on Wikidata☆164Updated 2 years ago
- Bagpipes spaCy is a collection of custom spaCy pipeline components designed to enhance text processing capabilities.☆19Updated last year
- ☆19Updated 4 years ago
- A maximum-strength name parser for record linkage.☆38Updated last month
- spaCy match and replace, maintaining conjugation☆35Updated 2 years ago
- Efficient BM25 with DuckDB 🦆☆55Updated 9 months ago
- scraping and querying documents for LLMs☆24Updated this week
- A spaCy wrapper of OpenTapioca for named entity linking on Wikidata☆94Updated 2 years ago