wikimedia/sentencex

A sentence segmentation library with wide language support optimized for speed and utility.

☆134

Alternatives and similar repositories for sentencex

Users interested in sentencex are comparing it to the repositories listed below.

GoFigure-LANL / VisHash
Visual Hash for matching copies of visually similar images.
☆16 · Mar 17, 2025 · Updated last year
LBeaudoux / iso639
A fast, comprehensive, ISO 639 library.
☆46 · Aug 12, 2025 · Updated 10 months ago
Softcatala / nmt-models
Softcatalà neural translation models
☆22 · Jan 17, 2026 · Updated 5 months ago
PsichiX / Keket
Database-like Asset management on top of ECS storage
☆14 · Jun 24, 2026 · Updated 2 weeks ago
Nudin / makesense
Generate Senses for Lexemes on Wikidata from already existing Wikidata Items
☆12 · Feb 8, 2026 · Updated 5 months ago
browsermt / students
Efficient teacher-student models and scripts to make them
☆57 · Dec 16, 2023 · Updated 2 years ago
explosion / spacy-vectors-builder
🌸 Train floret vectors
☆18 · May 4, 2023 · Updated 3 years ago
abelsiqueira / Breakage
GitHub Action workflow to test breakage of Julia packages on pull requests
☆11 · Mar 20, 2021 · Updated 5 years ago
cozheyuanzhangde / Forward-Forward
Hinton's Forward-Forward Algorithm for Deep Learning
☆10 · Feb 6, 2023 · Updated 3 years ago
laurieburchell / open-lid-dataset
Repository accompanying "An Open Dataset and Model for Language Identification" (Burchell et al., 2023)
☆77 · Apr 1, 2025 · Updated last year
Maximkaaa / cargo-warloc
Smart Rust LOC counter, distinguishing code, examples, tests and doc comments
☆24 · Apr 19, 2026 · Updated 2 months ago
wikimedia / performance-WikimediaDebug
Browser extension for Chrome and Firefox. Mirror from https://gerrit.wikimedia.org/g/performance/WikimediaDebug/.
☆16 · Jun 16, 2026 · Updated 3 weeks ago
ymoslem / OpenNMT-Web-Interface
Machine Translation Web Interface for OpenNMT-py
☆26 · Dec 24, 2021 · Updated 4 years ago
Freja-eID / frejaeidclient
Java client library for integration with Freja eID
☆12 · Feb 27, 2026 · Updated 4 months ago
agtabesh / lsh-js
Locality-Sensitive Hashing implementation in node.js for fast and scalable approximate nearest neighbors search
☆13 · Mar 31, 2019 · Updated 7 years ago
graphcore-research / jax-scalify
JAX Scalify: end-to-end scaled arithmetics
☆18 · Oct 30, 2024 · Updated last year
CunningLogic / BurritoRoot
Root Exploit by TeamAndIRC, released to root the Kindle Fire 6.2.1
☆19 · Jan 7, 2012 · Updated 14 years ago
fnielsen / ordia
Wikidata lexemes presentations
☆23 · Jan 30, 2026 · Updated 5 months ago
mediawiki-utilities / python-mwcites
☆40 · Jun 22, 2018 · Updated 8 years ago
mediawiki-utilities / python-mwsql
A set of utilities for processing MediaWiki SQL dump data
☆20 · Feb 19, 2024 · Updated 2 years ago
jbelyeu / unfazed
Unfazed by genomic variant phasing
☆28 · May 26, 2024 · Updated 2 years ago
nipunsadvilkar / pySBD
🐍💯 pySBD (Python Sentence Boundary Disambiguation) is a rule-based sentence boundary detection that works out-of-the-box.
☆920 · Aug 20, 2024 · Updated last year
johnsamuelwrites / awesome-wikidata
Curated list of Wikidata Projects
☆26 · Mar 3, 2026 · Updated 4 months ago
Geralt-Targaryen / MC-Evaluation
☆13 · May 21, 2024 · Updated 2 years ago
KIZI / sparqlab
Lab for exercising SPARQL
☆12 · Jan 16, 2022 · Updated 4 years ago
anastaw / Meedan-Memory
Meedan's Open Source Arabic/English Translation Memory
☆33 · Nov 4, 2009 · Updated 16 years ago
segment-any-text / wtpsplit
Toolkit to segment text into sentences or other semantic units in a robust, efficient and adaptable way.
☆1,308 · Apr 11, 2026 · Updated 2 months ago
arogozhnikov / adamw_bfloat16
AdamW optimizer for bfloat16 models in pytorch 🔥.
☆40 · Jun 16, 2024 · Updated 2 years ago
jacquerie / biorxiv-cli
A Python wrapper for the bioRxiv API.
☆11 · Aug 18, 2021 · Updated 4 years ago
LibreTranslate / nllu
No Language Left Unlocked: scalable backtranslation of NLLB models
☆14 · Aug 4, 2025 · Updated 11 months ago
HeardLibrary / linked-data
Documentation and Data related to the Linked Data and Wikidata Working Groups
☆23 · Apr 16, 2024 · Updated 2 years ago
nem6ishi / wat17
This is the official code used for WAT 2017 Description Paper titled A Bag of Useful Tricks for Practical Neural Machine Translation: Emb…
☆12 · Oct 24, 2017 · Updated 8 years ago
rasyosef / splade-index
Fast search index for SPLADE sparse retrieval models implemented in Python using Numpy and Numba
☆38 · Oct 16, 2025 · Updated 8 months ago
csisc / OpenCitations-Bot
A bot to add citation data from OpenCitations to Wikidata
☆12 · May 23, 2023 · Updated 3 years ago
Sunkyoung / Compare-tokenizer
Tokenizer comparison experiments
☆11 · Jan 3, 2022 · Updated 4 years ago
Helsinki-NLP / Opus-MT
Open neural machine translation models and web services
☆831 · Feb 23, 2026 · Updated 4 months ago
masakhane-io / africomet
COMET for African languages
☆11 · Jan 24, 2025 · Updated last year
thanhan / seqcrowd-acl17
☆11 · Jul 6, 2023 · Updated 3 years ago
tduyng / nvim
My minimal clean fast Neovim config 💚 ~20 plugins of pure joy.
☆33 · Apr 2, 2026 · Updated 3 months ago