alea-institute / nupunkt
Next-generation Punkt sentence boundary detection with zero dependencies
☆17 · Updated 2 months ago
Alternatives and similar repositories for nupunkt
Users who are interested in nupunkt are comparing it to the libraries listed below.
- Tool to apply Legal Matter Specification Standard (LMSS) to documents ☆13 · Updated 10 months ago
- Small python package to measure OCR quality and other related metrics. ☆23 · Updated last year
- A simple library for segmenting legal texts ☆17 · Updated 2 years ago
- SpaCyEx allows the creation of spaCy Matcher patterns with RegEx like syntax. ☆59 · Updated last year
- spaCy entry points for Curated Transformers ☆31 · Updated 3 weeks ago
- ☆55 · Updated last year
- Fastlaw's purpose is to replace generic word embeddings for work on supervised machine learning NLP-tasks with legal texts. ☆38 · Updated 6 years ago
- 💥 Use Hugging Face text and token classification pipelines directly in spaCy ☆63 · Updated last year
- 🌸 Train floret vectors ☆18 · Updated 2 years ago
- Bagpipes spaCy is a collection of custom spaCy pipeline components designed to enhance text processing capabilities. ☆18 · Updated 10 months ago
- Efficient few-shot learning with cross-encoders. ☆52 · Updated last year
- CLI that queries multiple language models in parallel using prompts from a CSV file ☆26 · Updated 3 weeks ago
- LLM plugin for embeddings using sentence-transformers ☆66 · Updated last month
- Library for fast text representation and classification. ☆30 · Updated last year
- Truly flash implementation of DeBERTa disentangled attention mechanism. ☆58 · Updated last month
- ☆23 · Updated last year
- Python library to use Pleias-RAG models ☆57 · Updated last month
- 🚂 Fine-tune OpenAI models for text classification, question answering, and more ☆16 · Updated 2 years ago
- spaCy extension for Visual Studio Code ☆32 · Updated 3 months ago
- ☆67 · Updated last year
- Code for SaGe subword tokenizer (EACL 2023) ☆25 · Updated 6 months ago
- API client for fetching and comparing passages from legislation ☆13 · Updated 4 months ago
- Adding Marimo to Datasette ☆21 · Updated 2 months ago
- A Datasette plugin providing an MLOps platform to train, eval and predict machine learning models ☆16 · Updated this week
- Named entity recognition for the legal domain ☆42 · Updated 4 years ago
- A BERT-based application for reusable text classification at scale ☆38 · Updated last year
- 🔢 Work with static vector models ☆28 · Updated 2 months ago
- ☆18 · Updated 4 years ago
- 📜 Dehyphenation of broken text (mainly German), i.e., extracted from a PDF ☆39 · Updated 3 years ago
- Versatile framework designed to streamline the integration of your models, as well as those sourced from Hugging Face, into complex progr… ☆32 · Updated 2 months ago