clips / wordkit
Featurize words into orthographic and phonological vectors.
☆40 · Updated last year
Related projects
Alternatives and complementary repositories for wordkit
- An example of how to use spaCy for extremely large files without running into memory issues ☆36 · Updated 2 years ago
- KenLM extension for spaCy 2.0. ☆16 · Updated 6 years ago
- Linguistic and stylistic complexity measures for (literary) texts ☆77 · Updated 9 months ago
- List of corpora annotated for coreference for different languages ☆17 · Updated 3 months ago
- Use spaCy for NLP and output to the FoLiA XML format. ☆12 · Updated 8 months ago
- Minimal code to train ELMo models in recent versions of TensorFlow ☆14 · Updated last year
- ☆64 · Updated last year
- Python library for extracting quantitative, reproducible metrics of multi-level alignment between speakers in naturalistic language corpo… ☆40 · Updated last week
- Toolkit to compile a comparable/parallel corpus from European Parliament proceedings ☆15 · Updated 4 years ago
- Jupyter extension to visualize dependency structures ☆28 · Updated 6 years ago
- Repository for code and metadata to support work described in "Authorless Topic Models: Biasing Models Away from Known Structure" ☆28 · Updated 4 years ago
- German lemmatization with IWNLP as extension for spaCy ☆24 · Updated last year
- Compiled tools, datasets, and other resources for historical text normalization. ☆16 · Updated 5 years ago
- Tools for training and evaluating word embeddings based on subtitles. Published as "subs2vec: Word embeddings from subtitles in 55 langua… ☆33 · Updated 4 years ago
- CoNLL-U to pandas DataFrame ☆31 · Updated 7 years ago
- Convert CoNLL output of a dependency parser into a LaTeX or Graphviz tree ☆12 · Updated 4 years ago
- ☆17 · Updated last year
- Pipeline component for spaCy (and other spaCy-wrapped parsers such as spacy-stanza and spacy-udpipe) that adds CoNLL-U properties to a Do… ☆76 · Updated 4 months ago
- COMBO is a jointly trained tagger, lemmatizer, and dependency parser. ☆36 · Updated last year
- Graph-based linguistic converter and merging tool for multi-level annotated corpora (using Python and NetworkX). ☆50 · Updated last year
- CoNLL 2018 Shared Task Team UDPipe-Future ☆39 · Updated 4 years ago
- A Large Automatically-Constructed Resource of Predicate Paraphrases ☆43 · Updated 4 years ago
- Corpus of naturalistic stories with annotation and psycholinguistic measures ☆50 · Updated 3 years ago
- Python version for Doug Biber's Multidimensional Analysis (MDA) ☆27 · Updated 5 months ago
- A Python module to process data for Frame Semantic Parsing ☆23 · Updated 4 years ago
- Gamma Agreement in Python ☆43 · Updated 8 months ago
- Python framework for processing Universal Dependencies data ☆57 · Updated this week
- Lexicons for the Multilingual UCREL Semantic Analysis System ☆39 · Updated last year
- Python package for calculating well-known measures in computational linguistics ☆13 · Updated 2 weeks ago
- A psycholinguistic modeling toolkit ☆24 · Updated last week