Belval / disklist
A Python list implementation that uses the disk to handle very large collections
☆14Updated 6 years ago
Alternatives and similar repositories for disklist
Users that are interested in disklist are comparing it to the libraries listed below
- Python scripts for retrieving CSV data from the Google Ngram Viewer and plotting it in XKCD style. The Python script for retrieving ngram…☆254Updated 5 years ago
- Notebooks and data associated to constructing and exploring a map of subreddits.☆55Updated 8 years ago
- Get list of common stop words in various languages in Python☆159Updated 3 months ago
- Regex-like pattern matching on a sentence's tree instead of on strings☆42Updated 7 years ago
- MicroTC is a text classifier with a minimalistic approach☆21Updated last year
- ☆98Updated 4 years ago
- Textpipe: clean and extract metadata from text☆302Updated 4 years ago
- Python wrapper for LanguageTool grammar checker☆329Updated 4 years ago
- Train word embeddings with Gensim and visualize them with TensorBoard☆34Updated 6 years ago
- spaCy pipeline component for adding text readability meta data to Doc objects.☆56Updated 6 years ago
- Text Mining and Topic Modeling Toolkit for Python with parallel processing power☆191Updated 2 years ago
- Removed at the request of those with deeper wallets than I.☆114Updated 6 years ago
- Python 2 & 3 wrapper around the Stanford Topic Modeling Toolbox. Intended to be used for hassle-free supervised topic classification with…☆58Updated 7 years ago
- Scalable String Similarity Joins in Python☆39Updated last year
- 💙 Emoji handling and meta data for spaCy with custom extension attributes☆183Updated 2 years ago
- A spell-checker extending Peter Norvig's with multi-typo correction, hamming distance weighting, and more.☆98Updated 5 years ago
- High-coverage and high-precision lexica of terms annotated with emotion scores for English and Italian.☆155Updated last year
- Python bindings to the Compact Language Detector☆33Updated 5 years ago
- Segtok v2 is here: https://github.com/fnl/syntok -- A rule-based sentence segmenter (splitter) and a word tokenizer using orthographic fe…☆171Updated 4 years ago
- Interactive GUI for NetworkX graphs☆144Updated last year
- displaCy-ent.js: An open-source named entity visualiser for the modern web☆200Updated 7 years ago
- A comprehensive and scalable set of string tokenizers and similarity measures in Python☆142Updated last year
- topic model visualization☆51Updated 10 years ago
- WordNet in JSON format.☆97Updated 5 years ago
- Python library for information extraction of quantities from unstructured text☆118Updated 2 years ago
- Python wrapper around SVDLIBC, a fast library for sparse Singular Value Decomposition☆55Updated 12 years ago
- A library to extract a publication date from a web page, along with a measure of the accuracy.☆41Updated 6 years ago
- NLP pipeline using word2vec (preprocessing/embedding/prediction/clustering)☆116Updated last year
- A Python library to calculate the readability score of a text.☆141Updated 8 years ago
- Tika-Similarity uses the Tika-Python package (Python port of Apache Tika) to compute file similarity based on Metadata features.☆108Updated 9 months ago