stephantul / reach
Load embeddings and featurize your sentences.
☆31 Updated last year
Alternatives and similar repositories for reach
Users who are interested in reach are comparing it to the libraries listed below.
- ☆30 Updated 3 years ago
- 🧪 Cutting-edge experimental spaCy components and features ☆105 Updated last year
- Official details for: [1803.08493] Context is Everything: Finding Meaning Statistically in Semantic Spaces ☆39 Updated 6 years ago
- An example of how to use spaCy for extremely large files without running into memory issues ☆36 Updated 3 years ago
- Learning BPE embeddings by first learning a segmentation model and then training word2vec ☆19 Updated 3 years ago
- A way to do annotations for NER. TALEN: Tool for Annotation of Low-resource ENtities ☆118 Updated 6 months ago
- ☆70 Updated 3 years ago
- spaCy match and replace, maintaining conjugation ☆36 Updated 3 years ago
- An unsupervised compound splitter ☆42 Updated 6 years ago
- ☆25 Updated 5 years ago
- A web application for tagging and retrieval of arguments in text ☆29 Updated 2 years ago
- A thin wrapper around the DBpedia Spotlight HTTP API ☆25 Updated 8 years ago
- Tool for parsing and converting various span encoding schemes. ☆23 Updated last year
- spaCy + UDPipe ☆165 Updated 3 years ago
- Pipeline component for spaCy (and other spaCy-wrapped parsers such as spacy-stanza and spacy-udpipe) that adds CoNLL-U properties to a Do… ☆81 Updated last year
- Inter-annotator agreement for Doccano ☆28 Updated 5 years ago
- DKPro C4CorpusTools is a collection of tools for processing CommonCrawl corpus, including Creative Commons license detection, boilerplate… ☆52 Updated 5 years ago
- Segtok v2 is here: https://github.com/fnl/syntok -- A rule-based sentence segmenter (splitter) and a word tokenizer using orthographic fe… ☆171 Updated 4 years ago
- Dice.com repo to accompany the dice.com 'Vectors in Search' talk by Simon Hughes, from the Activate 2018 search conference, and the 'Sear… ☆86 Updated 4 years ago
- Fast Word Clustering Software ☆79 Updated 11 months ago
- Repository for the word embeddings experiments described in "Evaluating Unsupervised Dutch Word Embeddings as a Linguistic Resource", pre… ☆84 Updated 4 years ago
- Getting interpretable dimensions in word embedding spaces. ☆15 Updated 2 years ago
- Code and data for segmentation experiments. ☆20 Updated 10 years ago
- Lightning Fast Language Prediction ☆167 Updated 4 months ago
- ☆68 Updated 3 years ago
- Code and data accompanying the paper "Approaching nested named entity recognition with parallel LSTM-CRFs." ☆27 Updated 3 years ago
- ☆59 Updated 10 years ago
- Robust and Fast tokenizations alignment library for Rust and Python https://tamuhey.github.io/tokenizations/ ☆193 Updated 2 years ago
- A python module for word inflections designed for use with spaCy. ☆93 Updated 5 years ago
- ALMa (Active Learning Manager) Keeps track of labeled and unlabeled data for active learning ☆42 Updated 5 years ago