dkalpakchi / awesome-swedish-nlp
A curated list of resources for natural language processing (NLP) in Swedish
☆25 · Updated 2 years ago
Alternatives and similar repositories for awesome-swedish-nlp
Users who are interested in awesome-swedish-nlp are comparing it to the repositories listed below.
- A tokenizer and sentence splitter for German and English web and social media texts. ☆147 · Updated 9 months ago
- A module to compute textual lexical richness (aka lexical diversity). ☆110 · Updated 2 years ago
- Code accompanying the submission "Structural Text Segmentation of Legal Documents" by Aumiller et al. ☆97 · Updated 2 years ago
- 🤘Lemmy is a lemmatizer for Danish 🇩🇰 and Swedish 🇸🇪 ☆77 · Updated 4 years ago
- Curated list of open-access/open-source/off-the-shelf resources and tools developed with a particular focus on German ☆498 · Updated 10 months ago
- DBMDZ BERT, DistilBERT, ELECTRA, GPT-2 and ConvBERT models ☆156 · Updated 2 years ago
- BERTje is a Dutch pre-trained BERT model developed at the University of Groningen. (EMNLP Findings 2020) "What’s so special about BERT’s … ☆139 · Updated 2 years ago
- spaCy-wrap is a wrapper library for spaCy for including fine-tuned transformers from Huggingface in your spaCy pipeline allowing you to i… ☆46 · Updated last year
- ConvoKit is a toolkit for extracting conversational features and analyzing social phenomena in conversations. It includes several large c… ☆602 · Updated this week
- An open-source text summarization toolkit for non-experts. EMNLP'2021 Demo ☆280 · Updated last year
- DaNLP is a repository for Natural Language Processing resources for the Danish Language. ☆207 · Updated 7 months ago
- Text tokenization and sentence segmentation (segtok v2) ☆206 · Updated 3 years ago
- A Python library for calculating a large variety of metrics from text ☆350 · Updated 9 months ago
- DaCy: The State of the Art Danish NLP pipeline using SpaCy ☆98 · Updated 9 months ago
- 🗺️ Data Cleaning and Textual Data Visualization 🗺️ ☆188 · Updated 4 months ago
- 🧪 Cutting-edge experimental spaCy components and features ☆101 · Updated last year
- Ten Thousand German News Articles Dataset for Topic Classification ☆86 · Updated 2 years ago
- A python module for English lemmatization and inflection. ☆270 · Updated 2 years ago
- Concept Modeling: Topic Modeling on Images and Text ☆214 · Updated 10 months ago
- Unannotated Spanish 3 Billion Words Corpora ☆104 · Updated 2 years ago
- This repository contains EmoBank, a large-scale text corpus manually annotated with emotion according to the psychological Valence-Arousa… ☆213 · Updated 2 years ago
- Measure the readability of a given text using surface characteristics ☆80 · Updated 7 months ago
- Linguistic and stylistic complexity measures for (literary) texts ☆84 · Updated last year
- 📂 Additional lookup tables and data resources for spaCy ☆107 · Updated 3 months ago
- Bilingual term extractor ☆58 · Updated last year
- Easier Automatic Sentence Simplification Evaluation ☆161 · Updated 2 years ago
- Text to sentence splitter using heuristic algorithm by Philipp Koehn and Josh Schroeder. ☆254 · Updated 2 years ago
- Simple multilingual lemmatizer for Python, especially useful for speed and efficiency ☆175 · Updated 3 months ago
- Norwegian Named Entities annotations on top of NDT (Norwegian Dependency Treebank) ☆69 · Updated last year
- xfspell — the Transformer Spell Checker ☆188 · Updated 5 years ago