nipunsadvilkar/pySBD

🐍💯 pySBD (Python Sentence Boundary Disambiguation) is a rule-based sentence boundary detection module that works out of the box.

☆927

Alternatives and similar repositories for pySBD

Users who are interested in pySBD are comparing it to the libraries listed below.

kevinlu1248 / pyate
PYthon Automated Term Extraction
☆318 · Feb 8, 2023 · Updated 3 years ago
malep2007 / C-Programming-Assignment
A repository for students to solve some problems in the C files provided. Instructions are provided below.
☆11 · Aug 28, 2023 · Updated 2 years ago
fnl / syntok
Text tokenization and sentence segmentation (segtok v2)
☆211 · Mar 12, 2022 · Updated 4 years ago
jenojp / negspacy
spaCy pipeline object for negating concepts in text
☆280 · Apr 20, 2026 · Updated 3 months ago
segment-any-text / wtpsplit
Toolkit to segment text into sentences or other semantic units in a robust, efficient and adaptable way.
☆1,320 · Jul 6, 2026 · Updated 2 weeks ago
R1j1t / contextualSpellCheck
✔️ Contextual word checker for better suggestions (not actively maintained)
☆420 · Jan 31, 2025 · Updated last year
NorskRegnesentral / skweak
skweak: A software toolkit for weak supervision applied to NLP tasks
☆925 · Sep 2, 2024 · Updated last year
gandersen101 / spaczz
Fuzzy matching and more functionality for spaCy.
☆258 · Jul 6, 2024 · Updated 2 years ago
sohutv / cachecloud-client
cachecloud client project
☆84 · Dec 30, 2020 · Updated 5 years ago
chartbeat-labs / textacy
NLP, before and after spaCy
☆2,239 · Sep 22, 2023 · Updated 2 years ago
explosion / spacy-transformers
🛸 Use pretrained transformers like BERT, XLNet and GPT-2 in spaCy
☆1,408 · Mar 27, 2026 · Updated 3 months ago
MaartenGr / PolyFuzz
Fuzzy string matching, grouping, and evaluation.
☆801 · Jul 10, 2025 · Updated last year
allenai / scispacy
A full spaCy pipeline and models for scientific/biomedical documents.
☆1,977 · Dec 4, 2025 · Updated 7 months ago
mmxgn / spacy-clausie
Implementation of the ClausIE information extraction system for Python + spaCy
☆230 · Aug 8, 2022 · Updated 3 years ago
flairNLP / flair
A very simple framework for state-of-the-art Natural Language Processing (NLP)
☆14,382 · Oct 27, 2025 · Updated 8 months ago
gaoyf / pinpoint
Pinpoint is an open source APM (Application Performance Management) tool for large-scale distributed systems written in Java.
☆17 · Jan 18, 2018 · Updated 8 years ago
ICLRandD / Blackstone
A spaCy pipeline and model for NLP on unstructured legal text.
☆693 · Jul 16, 2024 · Updated 2 years ago
explosion / spacy-stanza
💥 Use the latest Stanza (StanfordNLP) research models directly in spaCy
☆747 · Aug 15, 2024 · Updated last year
DerwenAI / pytextrank
Python implementation of TextRank algorithms ("textgraphs") for phrase extraction
☆2,219 · Jun 24, 2026 · Updated last month
msg-systems / coreferee
Coreference resolution for English, French, German and Polish, optimised for limited training data and easily extensible for further lang…
☆198 · Dec 18, 2022 · Updated 3 years ago
MaartenGr / KeyBERT
Minimal keyword extraction with BERT
☆4,207 · May 13, 2026 · Updated 2 months ago
babylonhealth / hmrb
☆70 · Nov 30, 2022 · Updated 3 years ago
webis-de / small-text
Active Learning for Text Classification in Python
☆646 · May 24, 2026 · Updated 2 months ago
mediacloud / sentence-splitter
Text to sentence splitter using heuristic algorithm by Philipp Koehn and Josh Schroeder.
☆258 · Nov 7, 2022 · Updated 3 years ago
erre-quadro / spikex
SpikeX - SpaCy Pipes for Knowledge Extraction
☆403 · Jul 30, 2021 · Updated 4 years ago
mpuig / spacy-lookup
Named Entity Recognition based on dictionaries
☆238 · Mar 3, 2019 · Updated 7 years ago
jfilter / clean-text
🧹 Python package for text cleaning
☆1,026 · May 15, 2026 · Updated 2 months ago
zaibacu / rita-dsl
A Domain Specific Language (DSL) for building language patterns. These can be later compiled into spaCy patterns, pure regex, or any othe…
☆70 · Updated this week
huggingface / sentence-transformers
State-of-the-Art Embeddings, Retrieval, and Reranking
☆18,944 · Updated this week
plasticityai / magnitude
A fast, efficient universal vector embedding utility package.
☆1,666 · Aug 3, 2023 · Updated 2 years ago
explosion / floret
🌸 fastText + Bloom embeddings for compact, full-coverage vectors with spaCy
☆343 · Apr 25, 2025 · Updated last year
facebookresearch / LASER
Language-Agnostic SEntence Representations
☆3,661 · May 2, 2024 · Updated 2 years ago
Brotherc / openplatform
Enterprise-grade open platform, including a documentation center, an API center, and more
☆130 · Sep 20, 2025 · Updated 10 months ago
MilaNLProc / contextualized-topic-models
A Python package to run contextualized topic modeling. CTMs combine contextualized embeddings (e.g., BERT) with topic models to get coher…
☆1,272 · Jul 24, 2025 · Updated last year
huggingface / setfit
Efficient few-shot learning with Sentence Transformers
☆2,777 · May 26, 2026 · Updated 2 months ago
HLasse / TextDescriptives
A Python library for calculating a large variety of metrics from text
☆366 · May 5, 2026 · Updated 2 months ago
atpuxiner / pytcli
pytcli: a command-line interface for the Python toollib package
☆108 · Jul 9, 2022 · Updated 4 years ago
ddangelov / Top2Vec
Top2Vec learns jointly embedded topic, document and word vectors.
☆3,102 · Nov 14, 2024 · Updated last year
makcedward / nlpaug
Data augmentation for NLP
☆4,663 · Updated this week