alea-institute / nupunkt
Next-generation Punkt sentence boundary detection with zero dependencies
☆26 Updated last month
Alternatives and similar repositories for nupunkt
Users who are interested in nupunkt are comparing it to the libraries listed below.
- Small python package to measure OCR quality and other related metrics. ☆25 Updated last year
- 💥 Use Hugging Face text and token classification pipelines directly in spaCy ☆63 Updated last year
- SpaCyEx allows the creation of spaCy Matcher patterns with RegEx like syntax. ☆59 Updated last year
- This is a prototype of a multi-lingual suite for named-entity recognition in Python. ☆21 Updated last year
- spaCy entry points for Curated Transformers ☆32 Updated 6 months ago
- Tool to apply Legal Matter Specification Standard (LMSS) to documents ☆12 Updated last year
- ☆55 Updated last year
- Code for SaGe subword tokenizer (EACL 2023) ☆27 Updated last year
- Named entity recognition for the legal domain ☆42 Updated 4 years ago
- Fastlaw's purpose is to replace generic word embeddings for work on supervised machine learning NLP-tasks with legal texts. ☆40 Updated 6 years ago
- ☆30 Updated 3 years ago
- 🌸 Train floret vectors ☆18 Updated 2 years ago
- ☆69 Updated 3 years ago
- An example of how to use spaCy for extremely large files without running into memory issues ☆36 Updated 3 years ago
- LegalCrawler: A tool for automated scraping of English legal corpora ☆59 Updated 3 years ago
- A simple library for segmenting legal texts ☆17 Updated 2 years ago
- Finds linguistic patterns effortlessly ☆39 Updated 2 years ago
- Source code and data for Like a Good Nearest Neighbor ☆30 Updated 11 months ago
- ☆20 Updated 4 years ago
- Generate reports for spaCy models. ☆29 Updated 3 years ago
- Library for fast text representation and classification. ☆31 Updated last year
- Tools for interactive visual exploration of semantic embeddings. ☆39 Updated last year
- It's a cooler way to store simple linear models. ☆27 Updated last year
- 📜 Dehyphenation of broken text (mainly German), i.e., extracted from a PDF ☆39 Updated 3 years ago
- Efficient BM25 with DuckDB 🦆 ☆59 Updated last year
- ☆17 Updated 2 years ago
- Mining Legal Arguments in Court Decisions - Data and software ☆73 Updated 2 years ago
- Plug-and-play document AI with zero-shot models. ☆120 Updated this week
- Python based Wikidata framework for easy dataframe extraction ☆45 Updated 2 years ago
- ☆70 Updated 3 years ago