nipunsadvilkar / pySBD
🐍💯 pySBD (Python Sentence Boundary Disambiguation) is a rule-based sentence boundary detection module that works out-of-the-box.
☆900 · Aug 20, 2024 · Updated last year
Alternatives and similar repositories for pySBD
Users who are interested in pySBD are comparing it to the libraries listed below.
- PYthon Automated Term Extraction ☆318 · Feb 8, 2023 · Updated 3 years ago
- Text tokenization and sentence segmentation (segtok v2) ☆208 · Mar 12, 2022 · Updated 3 years ago
- spaCy pipeline object for negating concepts in text ☆282 · Jun 16, 2025 · Updated 7 months ago
- Toolkit to segment text into sentences or other semantic units in a robust, efficient and adaptable way. ☆1,237 · Jan 31, 2026 · Updated 2 weeks ago
- skweak: A software toolkit for weak supervision applied to NLP tasks ☆926 · Sep 2, 2024 · Updated last year
- Fuzzy matching and more functionality for spaCy. ☆258 · Jul 6, 2024 · Updated last year
- Implementation of the ClausIE information extraction system for python+spacy ☆227 · Aug 8, 2022 · Updated 3 years ago
- ✔️ Contextual word checker for better suggestions (not actively maintained) ☆418 · Jan 31, 2025 · Updated last year
- 🛸 Use pretrained transformers like BERT, XLNet and GPT-2 in spaCy ☆1,403 · Nov 7, 2025 · Updated 3 months ago
- NLP, before and after spaCy ☆2,232 · Sep 22, 2023 · Updated 2 years ago
- Fuzzy string matching, grouping, and evaluation. ☆788 · Jul 10, 2025 · Updated 7 months ago
- A spaCy pipeline and model for NLP on unstructured legal text. ☆672 · Jul 16, 2024 · Updated last year
- A full spaCy pipeline and models for scientific/biomedical documents. ☆1,921 · Dec 4, 2025 · Updated 2 months ago
- A python package to run contextualized topic modeling. CTMs combine contextualized embeddings (e.g., BERT) with topic models to get coher… ☆1,264 · Jul 24, 2025 · Updated 6 months ago
- Active Learning for Text Classification in Python ☆639 · Feb 1, 2026 · Updated last week
- Coreference resolution for English, French, German and Polish, optimised for limited training data and easily extensible for further lang… ☆198 · Dec 18, 2022 · Updated 3 years ago
- Python implementation of TextRank algorithms ("textgraphs") for phrase extraction ☆2,209 · Feb 1, 2026 · Updated last week
- 🧹 Python package for text cleaning ☆1,002 · Jan 28, 2026 · Updated 2 weeks ago
- A very simple framework for state-of-the-art Natural Language Processing (NLP) ☆14,355 · Oct 27, 2025 · Updated 3 months ago
- 💥 Use the latest Stanza (StanfordNLP) research models directly in spaCy ☆744 · Aug 15, 2024 · Updated last year
- Minimal keyword extraction with BERT ☆4,106 · Feb 3, 2026 · Updated last week
- Top2Vec learns jointly embedded topic, document and word vectors. ☆3,105 · Nov 14, 2024 · Updated last year
- SpikeX - SpaCy Pipes for Knowledge Extraction ☆402 · Jul 30, 2021 · Updated 4 years ago
- A fast, efficient universal vector embedding utility package. ☆1,652 · Aug 3, 2023 · Updated 2 years ago
- Efficient few-shot learning with Sentence Transformers ☆2,678 · Dec 11, 2025 · Updated 2 months ago
- Named Entity Recognition based on dictionaries ☆241 · Mar 3, 2019 · Updated 6 years ago
- Data augmentation for NLP ☆4,645 · Jun 24, 2024 · Updated last year
- State-of-the-Art Text Embeddings ☆18,225 · Updated this week
- ☆70 · Nov 30, 2022 · Updated 3 years ago
- Fast & easy transfer learning for NLP. Harvesting language models for the industry. Focus on Question Answering. ☆1,751 · Dec 20, 2023 · Updated 2 years ago
- Information extraction from English and German texts based on predicate logic ☆394 · Jul 8, 2022 · Updated 3 years ago
- Language-Agnostic SEntence Representations ☆3,658 · May 2, 2024 · Updated last year
- Leveraging BERT and c-TF-IDF to create easily interpretable topics. ☆7,397 · Jan 31, 2026 · Updated 2 weeks ago
- High-accuracy NLP parser with models for 11 languages. ☆905 · Jan 10, 2022 · Updated 4 years ago
- This repository contains an easy and intuitive approach to few-shot classification using sentence-transformers or spaCy models, or zero-s… ☆220 · Jan 20, 2025 · Updated last year
- Single-document unsupervised keyword extraction ☆1,822 · Updated this week
- A python module for English lemmatization and inflection. ☆273 · Sep 14, 2023 · Updated 2 years ago
- A Python library for calculating a large variety of metrics from text ☆359 · Jan 30, 2026 · Updated 2 weeks ago
- Trankit is a Light-Weight Transformer-based Python Toolkit for Multilingual Natural Language Processing ☆788 · Jul 22, 2025 · Updated 6 months ago