openlanguageprofiles / olp-en-cefrj
Open Language Profiles — English profile datasets from CEFR-J
☆92 Updated 4 years ago
Related projects:
- Multilingual sentence alignment using sentence embeddings ☆92 Updated 9 months ago
- Hanzipy is a Chinese character and NLP module for Chinese language processing for python. It is primarily written to help provide a frame… ☆15 Updated 8 months ago
- Repository for CEFR-SP corpus and sentence level assessment ☆28 Updated this week
- NLP to classify a text's lexile level ☆31 Updated 2 years ago
- Analyzes the given text and determines its vocabulary level based on CEFR levels ☆41 Updated last year
- 中文词典 / 中文詞典。Chinese / Chinese-English dictionaries. ☆136 Updated 5 months ago
- Creates interlinearized versions of books (EPUB, MOBI, etc), adding "subtitles" with translations under each word in the text. ☆22 Updated 3 years ago
- Tokenizer POS-Tagger and Dependency-parser with BERT/RoBERTa/DeBERTa models for Japanese and other languages ☆46 Updated last week
- Sentence aligner ☆106 Updated 3 years ago
- 🈵 Collected resources to learn/study Manchu (Manchurian Language). 满语滿族満州語入門。 ☆9 Updated last year
- Improved Sentence Alignment in Linear Time and Space ☆157 Updated last year
- Tokenizer POS-tagger and Dependency-parser for Classical Chinese ☆62 Updated 6 months ago
- Converts English text to IPA notation ☆362 Updated last year
- A list of vocabulary lists ☆21 Updated 4 years ago
- Bilingual term extractor ☆50 Updated 9 months ago
- Exploring the idea of a generic, language agnostic, CEFR level classifier ☆20 Updated 6 years ago
- Offline bilingual dictionaries made using data from Wiktionary ☆52 Updated 9 years ago
- A Python package for learning, evaluating, annotating, and extracting vector representations of construction grammars ☆32 Updated 6 months ago
- British English pronunciation dictionary ☆87 Updated 6 years ago
- CogNet: a large-scale, high-quality cognate database for 338 languages, 1.07M words, and 8.1 million cognates ☆42 Updated last year
- Scrape glosbe dicts ☆9 Updated 2 years ago
- ☆21 Updated 4 months ago
- Linguistically analyzed Classical Tibetan texts ☆23 Updated 3 years ago
- Gather modern English word frequencies from all enwiki articles. ☆198 Updated 6 months ago
- Tokenizes Chinese texts into words. ☆93 Updated last year
- [LREC 2020] EtymDB, an Etymological DataBase (v2.1) ☆21 Updated 2 years ago
- Gale-Church sentence aligner with options for variable parameters ☆17 Updated 4 years ago
- Morphological Dictionaries for German Language ☆27 Updated 6 years ago
- NLP system for predicting the reading difficulty level of a text in terms of its CEFR level. ☆40 Updated 2 years ago
- Code for paper "Kanbun-LM: Reading and Translating Classical Chinese in Japanese Method by Language Models" ☆15 Updated last year