hrafnl / icenlp
IceNLP is an open source Natural Language Processing (NLP) toolkit for analyzing and processing Icelandic text. The toolkit is implemented in Java.
☆21 · Updated 8 months ago
Related projects
Alternatives and complementary repositories for icenlp
- LingPy: Python library for quantitative tasks in historical linguistics ☆124 · Updated 10 months ago
- A lemmatizer for Icelandic text ☆16 · Updated 6 years ago
- Crawler for linguistic corpora ☆192 · Updated 11 months ago
- Icelandic Treebank ☆23 · Updated 5 months ago
- Various utilities for processing the data. ☆205 · Updated this week
- A fully-fledged PyTorch package for Morphological Analysis, tailored to morphologically rich and historical languages. ☆22 · Updated last year
- ConllEditor is a tool to edit dependency syntax trees in CoNLL-U format. ☆54 · Updated this week
- Python version for Doug Biber's Multidimensional Analysis (MDA) ☆27 · Updated 4 months ago
- English data ☆201 · Updated last week
- Natural language processing pipeline for book-length documents (archival Java version; for current Python version, see: https://github.co… ☆309 · Updated 2 years ago
- Morfessor is a tool for unsupervised and semi-supervised morphological segmentation ☆185 · Updated 4 years ago
- spaCy + UDPipe ☆161 · Updated 2 years ago
- ANNIS is an open source, versatile web browser-based search and visualization architecture for complex multilevel linguistic corpora with… ☆69 · Updated this week
- A neural parsing pipeline for segmentation, morphological tagging, dependency parsing and lemmatization with pre-trained models for more … ☆111 · Updated 6 months ago
- An advanced, extensible web front-end for the Manatee-open corpus search engine ☆60 · Updated this week
- Multi Tier Annotation Search ☆26 · Updated 3 years ago
- linguistics backend ☆40 · Updated last year
- Detect and align similar passages ☆88 · Updated 2 months ago
- eXtensible Interlinear Glossed Text ☆31 · Updated 2 years ago
- Bilingual sentence aligner (Gale & Church, 1993) ☆14 · Updated 5 years ago
- Python framework for processing Universal Dependencies data ☆56 · Updated last week
- A tokenizer for Icelandic text ☆27 · Updated last month
- TED parallel Corpora is a growing collection of bilingual parallel corpora, multilingual parallel corpora and monolingual corpora extracted… ☆242 · Updated 8 years ago
- A character-wise tokenizer for morphologically rich languages ☆27 · Updated 4 months ago
- Software to detect text reuse with BLAST. ☆14 · Updated 5 years ago
- A part-of-speech tagger with support for domain adaptation and external resources. ☆22 · Updated 2 years ago
- Phonological CorpusTools ☆113 · Updated last week
- A tool for automatic spelling normalization ☆20 · Updated 3 years ago
- Poetry Annotated with Rhyme Schemes ☆21 · Updated 12 years ago
- A Python package for learning, evaluating, annotating, and extracting vector representations of construction grammars ☆34 · Updated 3 weeks ago