alea-institute / nupunkt
Next-generation Punkt sentence boundary detection with zero dependencies
☆27 Updated 2 months ago
Alternatives and similar repositories for nupunkt
Users who are interested in nupunkt are comparing it to the libraries listed below.
- Small Python package to measure OCR quality and other related metrics. ☆26 Updated last year
- SpaCyEx allows the creation of spaCy Matcher patterns with RegEx like syntax. ☆59 Updated last year
- spaCy entry points for Curated Transformers ☆32 Updated 8 months ago
- 💥 Use Hugging Face text and token classification pipelines directly in spaCy ☆63 Updated last year
- Tool to apply Legal Matter Specification Standard (LMSS) to documents ☆12 Updated last year
- ☆55 Updated 2 years ago
- Bagpipes spaCy is a collection of custom spaCy pipeline components designed to enhance text processing capabilities. ☆21 Updated last year
- Efficient BM25 with DuckDB 🦆 ☆61 Updated last year
- ☆68 Updated 3 years ago
- My NER Experiments with ModernBERT and Ettin ☆26 Updated 6 months ago
- Code for SaGe subword tokenizer (EACL 2023) ☆27 Updated last year
- Source code and data for Like a Good Nearest Neighbor ☆30 Updated last year
- Named entity recognition for the legal domain ☆42 Updated 4 years ago
- Plug-and-play document AI with zero-shot models. ☆123 Updated last week
- Python package for deduplication/entity resolution using active learning ☆83 Updated last year
- A spaCy wrapper of Entity-Fishing (component) for named entity disambiguation and linking on Wikidata ☆170 Updated 3 years ago
- Language detection using spaCy and fastText ☆57 Updated 2 years ago
- spaCy match and replace, maintaining conjugation ☆36 Updated 3 years ago
- A simple library for segmenting legal texts ☆17 Updated 2 years ago
- 🌸 Train floret vectors ☆18 Updated 2 years ago
- 🔢 Work with static vector models ☆36 Updated 9 months ago
- An open-source package for Python to clean raw text data ☆74 Updated 2 years ago
- ☆67 Updated last year
- ✂️ Sentence segmentation with wtpsplit's state-of-the-art Segment any Text (SaT) models ☆34 Updated 4 months ago
- Fastlaw's purpose is to replace generic word embeddings for work on supervised machine learning NLP-tasks with legal texts. ☆40 Updated 6 years ago
- A simple web application for searching Word2Vec embeddings derived from approximately 2,000 law reports published by the Incorporated… ☆26 Updated 3 years ago
- ☆17 Updated 3 years ago
- Efficient few-shot learning with cross-encoders. ☆61 Updated last year
- LegalCrawler: A tool for automated scraping of English legal corpora ☆59 Updated 3 years ago
- ☆20 Updated 4 years ago