anaistack / cefr-asag-corpus
A corpus of short answers written by learners of English and graded with CEFR levels
☆12 · Updated 3 years ago
Alternatives and similar repositories for cefr-asag-corpus
Users who are interested in cefr-asag-corpus are comparing it to the repositories listed below.
- Multilingual sentence alignment using sentence embeddings ☆128 · Updated last year
- A neural word aligner based on multilingual BERT ☆358 · Updated 3 years ago
- A collection of text simplification datasets and other resources ☆50 · Updated last year
- Repository for CEFR-SP corpus and sentence level assessment ☆53 · Updated last year
- Python version for Doug Biber's Multidimensional Analysis (MDA) ☆38 · Updated 8 months ago
- Improved Sentence Alignment in Linear Time and Space ☆184 · Updated 2 years ago
- OpusFilter - Parallel corpus processing toolkit ☆110 · Updated last month
- MFTE (Multi Feature Tagger of English) Python is the Python version based on Le Foll's MFTE written in Perl. It is extended to include se… ☆29 · Updated 5 months ago
- Annotation Tool for Text Simplification Corpora ☆17 · Updated 2 years ago
- Dutch coreference resolution & dialogue analysis using deterministic rules ☆22 · Updated 2 years ago
- Obtain Word Alignments using Pretrained Language Models (e.g., mBERT) ☆381 · Updated 2 years ago
- An initiative to collect and distribute resources for co-reference resolution in a unified standard. ☆25 · Updated last year
- A Python package for learning, evaluating, annotating, and extracting vector representations of construction grammars ☆39 · Updated last year
- https://sites.google.com/site/multidimensionaltagger ☆38 · Updated last year
- Natural Language Processing Research in North American Linguistics Departments ☆20 · Updated 7 months ago
- Natural language understanding benchmarks for Norwegian ☆14 · Updated 2 months ago
- A module to compute textual lexical richness (aka lexical diversity). ☆110 · Updated 2 years ago
- Python Multilingual Ucrel Semantic Analysis System ☆32 · Updated this week
- ☆65 · Updated 2 months ago
- cLang-8 is a dataset for grammatical error correction. ☆110 · Updated 3 years ago
- Annotated dataset of 100 works of fiction to support tasks in natural language processing and the computational humanities. ☆364 · Updated 2 years ago
- The University of Pittsburgh English Language Institute Corpus (PELIC) dataset ☆24 · Updated 2 years ago
- Automated Semantic Analysis of Discourse Markers ☆10 · Updated 3 years ago
- This packages up data for the Open Multilingual Wordnet ☆55 · Updated 5 months ago
- Sentence aligner ☆119 · Updated 4 years ago
- A simple toolkit for conducting analyses using corpus methods ☆26 · Updated 3 years ago
- Parallel corpora for the biomedical domain ☆50 · Updated last year
- This is a simple Python package for calculating a variety of lexical diversity indices ☆81 · Updated 2 years ago
- MorphyNet: a Large Multilingual Database of Derivational and Inflectional Morphology (+morpheme segmentation) ☆51 · Updated 2 years ago
- Tool for the Automatic Analysis of Syntactic Sophistication and Complexity ☆27 · Updated 2 years ago