uhermjakob / utoken
universal tokenizer
☆17 Updated 3 years ago
Alternatives and similar repositories for utoken
Users who are interested in utoken are comparing it to the libraries listed below.
- A python module for word inflections designed for use with spaCy. ☆92 Updated 5 years ago
- Transform TMX to text ☆27 Updated 2 years ago
- Runnable morphological analysis tools from the UniMorph project ☆16 Updated 6 years ago
- Python framework for processing Universal Dependencies data ☆57 Updated last week
- Analyze Argumentation and Rhetorical Aspects in Scientific Writing. ☆19 Updated 2 years ago
- English Resource Grammar ☆21 Updated 3 weeks ago
- bilingual dictionary extractor from parallel corpora ☆22 Updated 10 years ago
- Master repo for the UniMorph project, includes the UniMorph schema and annotated data files ☆30 Updated 5 years ago
- The Universal Decompositional Semantics (UDS) dataset and the Decomp toolkit ☆57 Updated 2 years ago
- Alpino parser and related tools for Dutch ☆24 Updated this week
- Efficient teacher-student models and scripts to make them ☆51 Updated last year
- 💫 A spaCy package for Yohei Tamura's Rust tokenizations library ☆29 Updated 3 weeks ago
- Finds linguistic patterns effortlessly ☆36 Updated last year
- Python Finite-State Toolkit ☆56 Updated this week
- UFSAC is a resource containing all WordNet Sense Annotated Corpora, and a Java library for manipulating them ☆38 Updated 3 years ago
- Extracts plain text, language identification and more metadata from WARC records ☆22 Updated 3 months ago
- bin files ☆13 Updated 4 months ago
- ☆74 Updated 3 months ago
- Verb∋Net, a French translation of VerbNet ☆10 Updated 7 years ago
- An easy-to-use library to linguistically compare one sentence and its words to another, in the same language or a different one. For inst… ☆22 Updated 3 years ago
- Python libraries for DELPH-IN ☆82 Updated last week
- A Word Sense Disambiguation system integrating implicit and explicit external knowledge. ☆69 Updated 3 years ago
- OpusFilter - Parallel corpus processing toolkit ☆104 Updated this week
- STREUSLE: a corpus with comprehensive lexical semantic annotation (multiword expressions, supersenses) ☆66 Updated 3 weeks ago
- Efficient Low-Memory Aligner ☆145 Updated 5 months ago
- ☆17 Updated 2 years ago
- A python library / model for creating co-references between AMR graph nodes. ☆10 Updated 2 years ago
- Put together a multilingual corpus from a variety of sources. Used for wordfreq and word embeddings. ☆52 Updated 3 years ago
- Official repository for Semlink resources ☆34 Updated 3 years ago
- linguistics tree drawing to SVG in python, aimed at Jupyter ☆65 Updated 10 months ago