alea-institute / nupunkt
Next-generation Punkt sentence boundary detection with zero dependencies
☆17 · Updated 3 months ago
Alternatives and similar repositories for nupunkt
Users who are interested in nupunkt are comparing it to the libraries listed below.
- Tool to apply Legal Matter Specification Standard (LMSS) to documents ☆13 · Updated 10 months ago
- A simple library for segmenting legal texts ☆17 · Updated 2 years ago
- Small Python package to measure OCR quality and other related metrics. ☆24 · Updated last year
- Fastlaw's purpose is to replace generic word embeddings for work on supervised machine learning NLP tasks with legal texts. ☆38 · Updated 6 years ago
- 💥 Use Hugging Face text and token classification pipelines directly in spaCy ☆63 · Updated last year
- Python-based Wikidata framework for easy dataframe extraction ☆45 · Updated last year
- SpaCyEx allows the creation of spaCy Matcher patterns with RegEx-like syntax. ☆59 · Updated last year
- ☆55 · Updated last year
- 🌸 Train floret vectors ☆18 · Updated 2 years ago
- Named entity recognition for the legal domain ☆42 · Updated 4 years ago
- Scraping and querying documents for LLMs ☆22 · Updated last month
- spaCy entry points for Curated Transformers ☆31 · Updated last month
- Language detection using spaCy and fastText ☆55 · Updated last year
- Mining Legal Arguments in Court Decisions - Data and software ☆68 · Updated 2 years ago
- ☆69 · Updated 3 years ago
- Generate reports for spaCy models. ☆29 · Updated 3 years ago
- Metadata Extractor & Loader (MEL) ■ The NLP-NER Toolkit (TNNT) ☆23 · Updated 2 years ago
- Python package for deduplication/entity resolution using active learning ☆80 · Updated 10 months ago
- ☆30 · Updated 3 years ago
- PyTorch implementation of a BiLSTM model for the Wikification project. ☆19 · Updated 5 years ago
- A Datasette plugin providing an MLOps platform to train, evaluate, and predict with machine learning models ☆16 · Updated 3 weeks ago
- ☆67 · Updated last year
- Discourse Analysis Tool Suite ☆26 · Updated this week
- Bagpipes spaCy is a collection of custom spaCy pipeline components designed to enhance text processing capabilities. ☆18 · Updated 10 months ago
- 📜 Dehyphenation of broken text (mainly German), e.g., text extracted from a PDF ☆39 · Updated 3 years ago
- A spaCy wrapper of Entity-Fishing (component) for named entity disambiguation and linking on Wikidata ☆162 · Updated 2 years ago
- Easy PDF to text to spaCy text extraction in Python. ☆39 · Updated 9 months ago
- Clean, filter and sample URLs to optimize data collection – Python & command-line – Deduplication, spam, content and language filters ☆142 · Updated 6 months ago
- CLI that queries multiple language models in parallel using prompts from a CSV file ☆27 · Updated last month
- spaCy match and replace, maintaining conjugation ☆35 · Updated 2 years ago