alea-institute / nupunkt
Next-generation Punkt sentence boundary detection with zero dependencies
☆17, updated 2 weeks ago
Alternatives and similar repositories for nupunkt
Users who are interested in nupunkt are comparing it to the libraries listed below.
- Tool to apply Legal Matter Specification Standard (LMSS) to documents (☆12, updated last year)
- 🌸 Train floret vectors (☆18, updated 2 years ago)
- 💥 Use Hugging Face text and token classification pipelines directly in spaCy (☆63, updated last year)
- Small Python package to measure OCR quality and other related metrics. (☆25, updated last year)
- SpaCyEx allows the creation of spaCy Matcher patterns with RegEx-like syntax. (☆59, updated last year)
- Fastlaw's purpose is to replace generic word embeddings for work on supervised machine learning NLP tasks with legal texts. (☆38, updated 6 years ago)
- A simple library for segmenting legal texts (☆17, updated 2 years ago)
- 🍏 Make Thinc faster on macOS by calling into Apple's native Accelerate library (☆99, updated last month)
- Bagpipes spaCy is a collection of custom spaCy pipeline components designed to enhance text processing capabilities. (☆18, updated last year)
- (☆55, updated last year)
- Clean, filter and sample URLs to optimize data collection – Python & command-line – Deduplication, spam, content and language filters (☆143, updated 7 months ago)
- Python-based Wikidata framework for easy dataframe extraction (☆45, updated last year)
- Scraping and querying documents for LLMs (☆23, updated last week)
- (☆76, updated 8 months ago)
- (☆71, updated 2 years ago)
- Mining Legal Arguments in Court Decisions - Data and software (☆68, updated 2 years ago)
- Named entity recognition for the legal domain (☆42, updated 4 years ago)
- 🦦 weasel: A small and easy workflow system (☆85, updated last year)
- Language detection using spaCy and fastText (☆57, updated last year)
- spaCy entry points for Curated Transformers (☆32, updated 2 months ago)
- (☆67, updated last year)
- 📜 Dehyphenation of broken text (mainly German), i.e., extracted from a PDF (☆39, updated 3 years ago)
- Tools for interactive visual exploration of semantic embeddings. (☆35, updated 11 months ago)
- LLM plugin for embeddings using sentence-transformers (☆70, updated 3 months ago)
- Plug-and-play document processing pipelines with zero-shot models. (☆86, updated last week)
- Finds linguistic patterns effortlessly (☆37, updated last year)
- This is a prototype of a multi-lingual suite for named-entity recognition in Python. (☆21, updated last year)
- This repository contains an easy and intuitive approach to use SetFit in combination with spaCy. (☆80, updated last year)
- An easy way to chunk spaCy docs. (☆21, updated last year)
- (☆19, updated 4 years ago)