alea-institute/nupunkt

Next-generation Punkt sentence boundary detection with zero dependencies

☆32

Alternatives and similar repositories for nupunkt

Users who are interested in nupunkt compare it to the libraries listed below.

jsavelka / sbd_adjudicatory_dec
☆20 · Jun 11, 2021 · Updated 5 years ago
echogarden-project / text-segmentation
A library for multilingual word, phrase and sentence segmentation.
☆16 · Updated this week
feyninc / tokie
🍡 30x faster tokenization for every HuggingFace model
☆49 · Updated this week
coastalcph / lexlms
LeXFiles and LegalLAMA: Facilitating English Multinational Legal Language Model Development
☆23 · Jul 24, 2023 · Updated 3 years ago
cicero-im / prompting
Prompting Techniques for Attorneys
☆18 · May 30, 2026 · Updated last month
sali-legal / LMSS
SALI LMSS: Legal Matter Standard Specification
☆82 · Mar 10, 2026 · Updated 4 months ago
medelman17 / socrates-api
Socrates is a thin wrapper around an early-stage [AllenNLP](https://allennlp.org/) model that enables machine reading comprehension (MRC)…
☆14 · Jan 12, 2021 · Updated 5 years ago
stephantul / pynife
Nearly Inference Free Embeddings: make your RAG queries 500x faster
☆80 · Apr 27, 2026 · Updated 3 months ago
sboghossian / nomos
Nomos — a programming language for legal reasoning. Typed rules with jurisdiction and validity dates, LLM-powered fact extraction, defeas…
☆20 · Apr 21, 2026 · Updated 3 months ago
thunlp / LEAD
Enhancing Legal Case Retrieval via Scaling High-quality Synthetic Query-Candidate Pairs (EMNLP 2024)
☆17 · Nov 17, 2024 · Updated last year
chrisstiles / PublishDateBot
A reddit bot that finds original publish dates on linked articles.
☆10 · Nov 30, 2024 · Updated last year
gambolputty / newscorpus
A Python scraping module, that extracts text from articles found in RSS feeds. Uses SQLite as database.
☆20 · Jul 5, 2024 · Updated 2 years ago
lm-pub-quiz / lm-pub-quiz
Evaluate language models using multiple choice items
☆13 · Mar 6, 2026 · Updated 4 months ago
273v / python-lmss
Legal Matter Standard Specification (LMSS) library for Python
☆17 · Nov 14, 2023 · Updated 2 years ago
pdufter / densray
Getting interpretable dimensions in word embedding spaces.
☆15 · Jul 6, 2023 · Updated 3 years ago
cisnlp / GlotWeb
[WWW 2026] 🕸 GlotWeb: Web Indexing for Minority Languages
☆17 · Apr 14, 2026 · Updated 3 months ago
neuml / staticvectors
🔢 Work with static vector models
☆39 · Apr 21, 2025 · Updated last year
mjbommar / ai-law-finance-book
☆69 · Jan 28, 2026 · Updated 6 months ago
john-friedman / secxbrl
A package to parse SEC XBRL at scale.
☆19 · Nov 25, 2025 · Updated 8 months ago
CaseMark / skills
☆29 · Jun 12, 2026 · Updated last month
tigerchen52 / GLADIS
GLADIS: A General and Large Acronym Disambiguation Benchmark (EACL 23)
☆18 · Jun 24, 2024 · Updated 2 years ago
noslegal / taxonomy
noslegal taxonomy facets and release notes
☆44 · May 29, 2026 · Updated last month
SELMA-project / ml4audio
audio, NLP, ML with huggingface, nvidia/nemo, speechbrain
☆11 · Sep 4, 2023 · Updated 2 years ago
dwillis / shot-scraper-nicar24
☆15 · Mar 11, 2024 · Updated 2 years ago
andreburgaud / robotspy
Alternative robots parser module for Python
☆22 · Jun 19, 2026 · Updated last month
freelawproject / inception
Our microservice for generating embeddings from blocks of text
☆54 · Feb 20, 2026 · Updated 5 months ago
alea-institute / kl3m-data
KL3M training data collection and preprocessing
☆22 · Apr 14, 2025 · Updated last year
smucclaw / l4-ide
This project is being continued by Legalese:
☆37 · Jul 15, 2026 · Updated 2 weeks ago
iliaschalkidis / flash-roberta
Hugging Face RoBERTa with Flash Attention 2
☆24 · Sep 14, 2025 · Updated 10 months ago
transitive-bullshit / abstract-object-storage
Collection of useful utilities for working with Google Cloud Storage.
☆13 · Dec 9, 2022 · Updated 3 years ago
lacuna-technologies / clerkent
[Moved to https://git.huey.xyz/lacuna-technologies/clerkent] quickly and easily search for and download case law; automatically rename do…
☆28 · Jan 2, 2026 · Updated 6 months ago
JSv4 / GremlinServer
A low-code microservices platform designed for legal engineers. Given a document, Gremlin will apply a series of Python scripts to it and…
☆33 · May 25, 2022 · Updated 4 years ago
vcoderun / acpkit
ACP Kit provides a common adapter for Agent Frameworks.
☆18 · Updated this week
mhulden / pyfoma
Python Finite-State Toolkit
☆68 · Jul 20, 2026 · Updated last week
markusdr / transducersaurus
Automatically exported from code.google.com/p/transducersaurus
☆11 · Apr 1, 2015 · Updated 11 years ago
KorAP / Koral
Translation of query languages to serialized KoralQuery protocol
☆15 · Updated this week
mscarey / AuthoritySpoke
Reading legal authority for the last time
☆44 · Jun 30, 2026 · Updated 3 weeks ago
AnjaneyaTripathi / knowledge_graph
Knowledge Graph for Legal Documents using Litigation Releases from the SEC website. Classifies into different crimes, extracts relevant i…
☆84 · Feb 15, 2022 · Updated 4 years ago
gesceap / prime16
Nanoloop source files for the album "Prime 16"
☆11 · Mar 7, 2026 · Updated 4 months ago