English word segmentation, written in pure Python and based on a trillion-word corpus.
☆378Dec 26, 2022Updated 3 years ago
Alternatives and similar repositories for python-wordsegment
Users that are interested in python-wordsegment are comparing it to the libraries listed below.
- Python implementation of Tribool data type.☆21Jan 22, 2020Updated 6 years ago
- Python module for computing statistics and regression in a single pass.☆101Jul 13, 2021Updated 4 years ago
- Python pattern matching like functional languages.☆161Feb 14, 2021Updated 5 years ago
- Probabilistically split concatenated words using NLP based on English Wikipedia unigram frequencies.☆873Feb 19, 2023Updated 3 years ago
- Recurrent versus Recursive Approaches Towards Compositionality in Semantic Vector Spaces.☆13Sep 22, 2021Updated 4 years ago
- Accurately generate all possible forms of an English word, e.g. "election" --> "elect", "electoral", "electorate", etc.☆631Jun 24, 2021Updated 5 years ago
- Building and Using A Seed Corpus for the Human Language Project☆11Feb 9, 2018Updated 8 years ago
- The New York Times English-Chinese parallel corpus☆17Dec 21, 2021Updated 4 years ago
- A spell-checker extending Peter Norvig's with multi-typo correction, hamming distance weighting, and more.☆97Oct 1, 2020Updated 5 years ago
- Easy language identification of 380 languages☆17Dec 2, 2019Updated 6 years ago
- ☆15Jul 6, 2016Updated 9 years ago
- Thoughts toward and tutorial on corpus-driven narrative generation☆25Nov 5, 2020Updated 5 years ago
- Spell-correct entire sentences using NLTK FreqDist and SymSpell☆18Jul 3, 2017Updated 9 years ago
- Code for paper "Cross-Domain Slot Filling as Machine Reading Comprehension" in IJCAI 2021☆11Aug 24, 2021Updated 4 years ago
- Python Sorted Container Types: Sorted List, Sorted Dict, and Sorted Set☆3,963Mar 8, 2024Updated 2 years ago
- The Non-Official Characterization (NOC) List is a knowledge-base containing semantic triples about famous people, living and dead, fictio…☆24Jan 9, 2019Updated 7 years ago
- Efficient Counter that uses a limited (bounded) amount of memory regardless of data size.☆932Nov 20, 2022Updated 3 years ago
- Python disk-backed cache (Django-compatible). Faster than Redis and Memcached. Pure-Python.☆2,891Aug 10, 2024Updated last year
- NLP, before and after spaCy☆2,240Sep 22, 2023Updated 2 years ago
- A machine learning dataset consisting of 5000 images of pebbles☆18Nov 19, 2018Updated 7 years ago
- Text pattern search using marisa-trie☆19Jan 26, 2025Updated last year
- Python port of SymSpell: 1 million times faster spelling correction & fuzzy search through Symmetric Delete spelling correction algorithm…☆874Jun 27, 2026Updated last week
- Course page for KU course on text data and deep learning https://kurser.ku.dk/course/a%c3%98kk08401u/2019-2020☆10May 15, 2020Updated 6 years ago
- Port of Google's language-detection library to Python.☆1,893Mar 3, 2025Updated last year
- A simple algorithm that splits English words into roots and affixes and returns the best solution☆20Oct 24, 2018Updated 7 years ago
- Python search module for fast approximate string matching☆54Jan 25, 2023Updated 3 years ago
- Morfessor is a tool for unsupervised and semi-supervised morphological segmentation☆206Oct 6, 2020Updated 5 years ago
- Extract Keywords from sentence or Replace keywords in sentences.☆5,714Apr 13, 2025Updated last year
- Contrastive Learning for Improving ASR Robustness in Spoken Language Understanding☆11May 19, 2023Updated 3 years ago
- A tiny script to convert your mdx dictionary file to CSV☆11Dec 22, 2018Updated 7 years ago
- Use spaCy for NLP and output to the FoLiA XML format.☆12Feb 27, 2024Updated 2 years ago
- The Average Novel☆10Dec 2, 2017Updated 8 years ago
- Sketching algorithms implemented in Chapel and Python☆10Jun 8, 2017Updated 9 years ago
- Spoken Language Translation System☆14Jun 25, 2019Updated 7 years ago
- Phrasal verbs training site: https://little-brother.github.io/english-phrasal-verbs/☆15Nov 8, 2018Updated 7 years ago
- Software for batch lookup of foreign-language words☆15Jan 11, 2023Updated 3 years ago
- A library for Multilingual Unsupervised or Supervised word Embeddings☆3,244Aug 31, 2022Updated 3 years ago
- Corpus of Annotations for Misspellings☆29Jul 31, 2023Updated 2 years ago
- SymSpell: 1 million times faster spelling correction & fuzzy search through Symmetric Delete spelling correction algorithm☆3,444Apr 21, 2026Updated 2 months ago