buschmo / Simple-German-Corpus
Code to create the dataset from "A New Aligned Simple German Corpus"
☆10 Updated last year
Alternatives and similar repositories for Simple-German-Corpus:
Users who are interested in Simple-German-Corpus are comparing it to the repositories listed below.
- TextComplexityDE dataset consists of 1000 sentences in the German language with subjective complexity rating, collected from German learn… ☆12 Updated 2 years ago
- Deutsches Lyrik Korpus (DLK) / German Poetry Corpus ☆18 Updated 9 months ago
- An Easy Annotation Tool for Natural Language Processing ☆10 Updated 9 months ago
- A Word Sense Disambiguation system integrating implicit and explicit external knowledge. ☆68 Updated 3 years ago
- BERT and ELECTRA models trained on Europeana Newspapers ☆37 Updated 3 years ago
- Pipeline component for spaCy (and other spaCy-wrapped parsers such as spacy-stanza and spacy-udpipe) that adds CoNLL-U properties to a Do… ☆79 Updated 7 months ago
- A python package to run inference with HuggingFace language and vision-language checkpoints wrapping many convenient features. ☆26 Updated 5 months ago
- Dutch coreference resolution & dialogue analysis using deterministic rules ☆21 Updated last year
- A spaCy custom component that extracts and normalizes temporal expressions ☆54 Updated 2 years ago
- Sentiment Corpus for Swedish 🇸🇪 Norwegian 🇳🇴 Danish 🇩🇰 Finnish 🇫🇮 (and English 🏴) ☆15 Updated 3 years ago
- Code for the paper "Modeling Information Change in Science Communication with Semantically Matched Paraphrases" from EMNLP 2022 ☆13 Updated 2 years ago
- Repo originally for a talk at Normconf ☆21 Updated 2 years ago
- SIGTYP 2024 Shared Task on Word Embedding Evaluation for Ancient and Historical Languages ☆8 Updated last year
- Repository with code for MaChAmp: https://aclanthology.org/2021.eacl-demos.22/ ☆83 Updated last week
- A Python package to compute HONEST, a score to measure hurtful sentence completions in language models. Published at NAACL 2021. ☆21 Updated 2 years ago
- PassivePy: A Tool to Automatically Identify Passive Voice in Big Text Data ☆20 Updated 11 months ago
- A lightweight Python library for constructing, processing, and visualizing constituent trees. ☆66 Updated last month
- Ten Thousand German News Articles Dataset for Topic Classification ☆84 Updated 2 years ago
- An initiative to collect and distribute resources for co-reference resolution in a unified standard. ☆24 Updated 9 months ago
- Natural language understanding benchmarks for Norwegian ☆14 Updated last year
- Python Finite-State Toolkit ☆50 Updated last month
- Fast computation of Krippendorff's alpha agreement measure in Python. ☆139 Updated last month
- A software for transferring pre-trained English models to foreign languages ☆18 Updated last year
- A package for handy processing of semantic graphs such as AMR, with a special focus on standardized evaluation ☆20 Updated 4 months ago
- Datasets for the task of tracing diachronic semantic shifts in Russian for two large-scale time period pairs (from pre-Soviet to Soviet t… ☆14 Updated 9 months ago
- This is a German text corpus from Wikipedia. It is cleaned, preprocessed and sentence-split. Its purpose is to train NLP embeddings l… ☆23 Updated 3 years ago
- 🧪 Cutting-edge experimental spaCy components and features ☆96 Updated 9 months ago
- Python wrapper for the CWB to extract concordances and score frequency lists ☆20 Updated 2 weeks ago
- A curated list of awesome datasets with human label variation (un-aggregated labels) in Natural Language Processing and Computer Vision, … ☆80 Updated 10 months ago
- Annotation Tool for Text Simplification Corpora ☆16 Updated last year