Next-generation Punkt sentence boundary detection with zero dependencies
☆30Nov 18, 2025Updated 5 months ago
Alternatives and similar repositories for nupunkt
Users who are interested in nupunkt are comparing it to the libraries listed below.
- Tool to apply Legal Matter Specification Standard (LMSS) to documents☆12Aug 15, 2024Updated last year
- LeXFiles and LegalLAMA: Facilitating English Multinational Legal Language Model Development☆22Jul 24, 2023Updated 2 years ago
- A collection of outcomes and discoveries from our legal AI research projects☆26Apr 10, 2026Updated 3 weeks ago
- ☆28Dec 20, 2021Updated 4 years ago
- Legal Matter Standard Specification (LMSS) library for Python☆17Nov 14, 2023Updated 2 years ago
- 🔢 Work with static vector models☆39Apr 21, 2025Updated last year
- A reddit bot that finds original publish dates on linked articles.☆10Nov 30, 2024Updated last year
- Evaluate language models using multiple choice items☆13Mar 6, 2026Updated 2 months ago
- Getting interpretable dimensions in word embedding spaces.☆15Jul 6, 2023Updated 2 years ago
- A Python scraping module, that extracts text from articles found in RSS feeds. Uses SQLite as database.☆20Jul 5, 2024Updated last year
- One library to split them all: Sentence, Code, Docs. Chunk smarter, not harder — built for LLMs, RAG pipelines, and beyond.☆77Updated this week
- ChatGPT with access to the internet☆25Jun 16, 2023Updated 2 years ago
- Searching in-memory corpus with Corpus Query Language (CQL)☆19Dec 2, 2024Updated last year
- Bajo los adoquines, la PLAYA 🏖️☆17Apr 13, 2026Updated 3 weeks ago
- noslegal taxonomy facets and release notes☆43Aug 13, 2025Updated 8 months ago
- Nearly Inference Free Embeddings: make your RAG queries 500x faster☆77Apr 27, 2026Updated last week
- Learning BPE embeddings by first learning a segmentation model and then training word2vec☆19Dec 18, 2022Updated 3 years ago
- Alternative robots parser module for Python☆22Apr 8, 2026Updated last month
- A low-code microservices platform designed for legal engineers. Given a document, Gremlin will apply a series of Python scripts to it and…☆33May 25, 2022Updated 3 years ago
- SpaCyEx allows the creation of spaCy Matcher patterns with RegEx like syntax.☆59May 3, 2024Updated 2 years ago
- Gantry provides an API that streamlines running experiments in Beaker☆33Apr 8, 2026Updated last month
- Plug-and-play document AI with zero-shot models.☆124May 1, 2026Updated last week
- SALI LMSS: Legal Matter Standard Specification☆77Mar 10, 2026Updated last month
- Translation of query languages to serialized KoralQuery protocol☆14Apr 29, 2026Updated last week
- NER model for 10K and 10Q SEC filings☆14Jun 18, 2020Updated 5 years ago
- Answer questions against collections stored in LLM using Retrieval Augmented Generation☆29Jan 29, 2024Updated 2 years ago
- Benchmark scripts for comparing different tokenizers and sentence segmenters of German☆12Feb 27, 2023Updated 3 years ago
- Jazz Structure Dataset☆35Jul 11, 2024Updated last year
- Download client for legal opinions☆13Jan 26, 2025Updated last year
- code and data used to build a training dataset for dragnet models☆10Nov 29, 2020Updated 5 years ago
- Create dynamic web scraper in Objective-C or Ruby!☆24Mar 28, 2015Updated 11 years ago
- Code and data for the paper "Soft Gazetteers for Low-resource Named Entity Recognition"☆19Nov 3, 2020Updated 5 years ago
- This is a prototype of a multi-lingual suite for named-entity recognition in Python. ➡️ The project has moved to: https://gitlab.opencode…☆21Mar 20, 2026Updated last month
- Source code accompanying the ICLR2020 publication 'Massively Multilingual Sparse Word Representations' https://openreview.net/forum?id=Hy…☆12Aug 15, 2023Updated 2 years ago
- A Structured Output Benchmark whose 'ground-truth' is actually right☆19Dec 5, 2025Updated 5 months ago
- FlexiTokens☆22Dec 27, 2025Updated 4 months ago
- Small string compression using smaz compression algorithm. Fast, because it's in C. Supports Python 3+☆13Oct 18, 2025Updated 6 months ago
- Nano Bots for Obsidian: small, AI-powered bots that can be easily shared as a single file, designed to support multiple providers such as…☆15Jan 13, 2024Updated 2 years ago
- Automatically sync your pre-commit hooks version from your PDM, Poetry or UV lockfile, and install them automatically.☆30May 2, 2026Updated last week