nullnull/simstring

A Python implementation of SimString, a simple and efficient algorithm for approximate string matching.

☆125

Alternatives and similar repositories for simstring

Users who are interested in simstring are comparing it to the libraries listed below.

chokkan / simstring
SimString
☆114 · May 16, 2021 · Updated 5 years ago
tuem / resembla
☆74 · Aug 3, 2025 · Updated 11 months ago
danlou / MedLinker
ECIR 2020 - MedLinker: Medical Entity Linking with Neural Representations and Dictionary Matching
☆28 · Jun 12, 2023 · Updated 3 years ago
vered1986 / LexNET
LexNET: Integrated Path-based and Distributional Method for Lexical Semantic Relation Classification
☆62 · Oct 31, 2018 · Updated 7 years ago
shimo-lab / sembei
Word embeddings without word segmentation
☆14 · Mar 19, 2017 · Updated 9 years ago
chakki-works / chariot
Deliver the ready-to-train data to your NLP model.
☆123 · Jul 15, 2022 · Updated 4 years ago
Hironsan / ja.text8
Japanese text8 corpus for word embedding.
☆111 · Oct 4, 2017 · Updated 8 years ago
hiroshi-manabe / CRFSegmenter
A multi-language segmenter using high-order CRF.
☆17 · Feb 27, 2020 · Updated 6 years ago
ikegami-yukino / flati
Flatten nested iterable object for Python (Pure-Python implementation)
☆29 · Aug 15, 2025 · Updated 11 months ago
krandiash / gpt3-nli
Training a model without a dataset for natural language inference (NLI)
☆25 · Aug 3, 2020 · Updated 5 years ago
yagays / embedrank
Python Implementation of EmbedRank
☆48 · Mar 19, 2019 · Updated 7 years ago
ikegami-yukino / neologdn
Japanese text normalizer for mecab-neologd
☆289 · May 6, 2026 · Updated 2 months ago
nullnull / normalizeNumexp
normalizer of numerical / temporal expression
☆11 · Sep 2, 2018 · Updated 7 years ago
6 / kaomoji-json
4000+ annotated 顔文字 (kaomoji) in JSON (UTF-8 & ShiftJIS)ヽ(`Д´*)ﾉ
☆27 · Jul 11, 2014 · Updated 12 years ago
chanzuckerberg / MedMentions
A corpus of Biomedical papers annotated with mentions of UMLS entities.
☆347 · Nov 9, 2021 · Updated 4 years ago
skozawa / Comainu
COrpus based Morphological Analyzer with INtegrated User dictionary
☆21 · Mar 30, 2025 · Updated last year
chakki-works / Japanese-Company-Lexicon
☆99 · Jul 23, 2023 · Updated 3 years ago
konabuta / Automated-ML-Workshop
AutoML Workshop (Azure Machine Learning mainly)
☆13 · Jan 5, 2020 · Updated 6 years ago
discourse-lab / DiscourseSegmenter
A collection of various discourse segmenters
☆10 · Jun 30, 2017 · Updated 9 years ago
nandenjin / itfdic
A localized word dictionary asset for University of Tsukuba
☆12 · Sep 19, 2025 · Updated 10 months ago
WorksApplications / SudachiPy
Python version of Sudachi, a Japanese tokenizer.
☆442 · Oct 7, 2022 · Updated 3 years ago
mynlp / niilc-qa
NIILC QA data
☆18 · Nov 20, 2015 · Updated 10 years ago
arosh / fmx
full text search engine based on compact data structures
☆13 · Jan 26, 2015 · Updated 11 years ago
wareya / notmecab-rs
notmecab-rs is a very basic mecab clone, designed only to do parsing, not training.
☆18 · Jul 25, 2020 · Updated 6 years ago
tmu-nlp / simple-jppdb
A paraphrase database for Japanese text simplification
☆32 · Mar 12, 2017 · Updated 9 years ago
uwnlp / qamr
Question-Answer Meaning Representation
☆48 · Feb 17, 2022 · Updated 4 years ago
singletongue / wikipedia-utils
Utility scripts for preprocessing Wikipedia texts for NLP
☆78 · Apr 9, 2024 · Updated 2 years ago
neologd / namelti
Namelti : The automatic transcription generation library for person name in Katakana
☆24 · Jul 10, 2023 · Updated 3 years ago
himkt / pyner
🌈 Implementation of Neural Network based Named Entity Recognizer (Lample+, 2016) using Chainer.
☆45 · Dec 8, 2022 · Updated 3 years ago
ndl-lab / ndlngramdata
A dataset of n-gram frequency statistics for OCR text data created from digitized materials
☆17 · Jan 10, 2023 · Updated 3 years ago
TatsuyaShirakawa / poincare-embedding
Poincaré Embedding (unofficial)
☆229 · May 7, 2019 · Updated 7 years ago
UUDeCART / decart_rule_based_nlp
☆13 · Aug 6, 2019 · Updated 6 years ago
tatHi / optok
☆10 · Aug 26, 2021 · Updated 4 years ago
Linear95 / BinarySentEmb
Code for ACL 2019 oral paper - Learning Compressed Sentence Representations for On-Device Text Processing.
☆45 · Sep 1, 2020 · Updated 5 years ago
iskana / PBWT-sec
☆14 · Feb 7, 2020 · Updated 6 years ago
mlpnlp / mlpnlp
Machine Learning Professional Series: Natural Language Processing with Deep Learning
☆36 · Jun 15, 2023 · Updated 3 years ago
GINK03 / keras-seq2seq
minimal seq2seq of keras
☆24 · Jun 17, 2017 · Updated 9 years ago
horita-yuya / rust_block_kit
Simple wrapper for Slack API implemented by Rust.
☆12 · Dec 27, 2019 · Updated 6 years ago
mcoavoux / mtg
Statistical discontinuous constituent parsing
☆11 · Feb 15, 2018 · Updated 8 years ago