pushshift / rinzler
A high-performance indexing and search system for managing big data
☆18, updated 6 years ago
Alternatives and similar repositories for rinzler
Users who are interested in rinzler are comparing it to the libraries listed below.
- Hidden alignment conditional random field for classifying string pairs. ☆36, updated 8 years ago
- Python bindings to the Compact Language Detector ☆33, updated 5 years ago
- Misspelling Oblivious Word Embeddings ☆201, updated 6 years ago
- Socially-Equitable Language Identification ☆78, updated 2 years ago
- Segtok v2 is here: https://github.com/fnl/syntok -- A rule-based sentence segmenter (splitter) and a word tokenizer using orthographic fe… ☆171, updated 4 years ago
- Read compressed NDJSON .zst files easily ☆35, updated 3 years ago
- Lightning Fast Language Prediction 🚀 ☆167, updated 5 months ago
- spaCy + UDPipe ☆166, updated 3 years ago
- Language Tool style grammar handling with spaCy 2.0 ☆42, updated 7 years ago
- Hunspell extension for spaCy 2.0. ☆94, updated last year
- Parallel Semi-Supervised Latent Dirichlet Allocation ☆33, updated 4 years ago
- SimString ☆113, updated 4 years ago
- Facebook fastText database in SQLite with Go API ☆35, updated 5 years ago
- Repository for the word embeddings experiments described in "Evaluating Unsupervised Dutch Word Embeddings as a Linguistic Resource", pre… ☆84, updated 4 years ago
- A disk-based key/value store in Python with no dependencies. ☆21, updated 10 years ago
- Python package for lexicon; Trie and DAWG implementation. ☆56, updated last year
- Golang port of the boilerpipe Java library used for the removal of boilerplate and extraction of text content from HTML documents. ☆72, updated 9 months ago
- ☆59, updated 10 years ago
- Load embeddings and featurize your sentences. ☆31, updated last year
- Package for performing Reddit-based text analysis ☆20, updated 7 years ago
- Tokenizer for Twitter and Reddit data ☆45, updated 6 years ago
- Making sense embedding out of word embeddings using graph-based word sense induction