A module to compute textual lexical richness (aka lexical diversity).
☆112 · Aug 27, 2023 · Updated 2 years ago
Alternatives and similar repositories for LexicalRichness
Users who are interested in LexicalRichness are comparing it to the libraries listed below.
- This is a simple Python package for calculating a variety of lexical diversity indices ☆82 · Sep 15, 2023 · Updated 2 years ago
- Tool for the automatic assessment of lexical diversity ☆14 · Sep 6, 2025 · Updated 6 months ago
- Data from the paper "Ghostbuster: Detecting Text Ghostwritten by Large Language Models" ☆14 · May 27, 2024 · Updated last year
- The official implementation of the EMNLP 2023 paper "Paraphrase Types for Generation and Detection" ☆12 · Oct 20, 2024 · Updated last year
- PANiC - PAraphrasing Noun-Compounds ☆15 · Apr 6, 2018 · Updated 7 years ago
- A general-purpose NLP pipeline for Ancient Greek ☆28 · Mar 26, 2024 · Updated last year
- Python package for calculating famous measures in computational linguistics ☆15 · Nov 5, 2024 · Updated last year
- DreamBank Visualized - An interactive visualization of over 26,000 dream transcriptions ☆15 · Jun 16, 2018 · Updated 7 years ago
- Whisper finetuning ☆16 · Apr 9, 2025 · Updated 11 months ago
- An easy-to-use library to extract indices from texts. ☆30 · Sep 7, 2021 · Updated 4 years ago
- Code for Dissecting Generation Modes for Abstractive Summarization Models via Ablation and Attribution (ACL2021) ☆13 · Jun 2, 2021 · Updated 4 years ago
- This Guidance demonstrates how to accelerate your content analysis workflows by automating video metadata extraction, intelligence gather… ☆13 · Feb 26, 2025 · Updated last year
- Arabic Word-Embedding (Word2vec) model training from Wikipedia articles ☆11 · Dec 13, 2018 · Updated 7 years ago
- Universal Dependency Treebanks in Korean ☆38 · Dec 19, 2021 · Updated 4 years ago
- Mutual Muses is a crowdsourced transcription project undertaken by the Digital Art History program at the Getty Research Institute ☆17 · May 3, 2018 · Updated 7 years ago
- Code for SaGe subword tokenizer (EACL 2023) ☆28 · Nov 30, 2024 · Updated last year
- Research codes for image interestingness ☆17 · Dec 6, 2017 · Updated 8 years ago
- Avalinguo Audio Dataset: Dataset for Speaker Fluency Level Classification ☆13 · Aug 13, 2018 · Updated 7 years ago
- Demo server for TREC LiveQA competition ☆11 · Dec 7, 2016 · Updated 9 years ago
- ☆15 · Oct 4, 2024 · Updated last year
- Repository for the CommonLit Ease of Readability Corpus ☆24 · Apr 17, 2024 · Updated last year
- Code for the paper "Greed is All You Need: An Evaluation of Tokenizer Inference Methods" ☆13 · Nov 26, 2024 · Updated last year
- Implementation, trained models and result data for the paper "Aspect-based Document Similarity for Research Papers" #COLING2020 ☆63 · Apr 30, 2024 · Updated last year
- The main controller for services in the cs-insights project through docker-compose. ☆13 · Aug 25, 2023 · Updated 2 years ago
- PassivePy: A Tool to Automatically Identify Passive Voice in Big Text Data ☆23 · Mar 6, 2024 · Updated 2 years ago
- Colab, MLflow and papermill are individually great. Together they form a dream team. ☆10 · Jun 9, 2020 · Updated 5 years ago
- GW2 inventory cleanup tool ☆15 · Apr 5, 2025 · Updated 11 months ago
- ☆14 · Oct 1, 2025 · Updated 5 months ago
- The University of Pittsburgh English Language Institute Corpus (PELIC) dataset ☆26 · Mar 6, 2026 · Updated 2 weeks ago
- Fragments-Expert is a software package for feature extraction from file fragments and classification among various file formats. ☆13 · Jan 16, 2024 · Updated 2 years ago
- Dataset of the Samaritan Pentateuch ☆11 · Updated this week
- An algorithm that analyses the acoustic features of a voice and creates an acoustic classifier - USEFUL for auto-speech-rater ☆11 · Mar 8, 2019 · Updated 7 years ago
- Code for the ACL 2022 paper "Efficient Unsupervised Sentence Compression by Fine-tuning Transformers with Reinforcement Learning" ☆37 · Dec 5, 2022 · Updated 3 years ago
- Tropy plugin to import IIIF manifests ☆17 · Mar 11, 2026 · Updated last week
- XML files for linguistic annotation of the Greek New Testament ☆13 · Jun 12, 2018 · Updated 7 years ago
- Construction Grammar based BERT ☆14 · Dec 5, 2020 · Updated 5 years ago
- Dataset accompanying the paper "Adaptive Methods for Real-World Domain Generalization" ☆16 · Aug 17, 2023 · Updated 2 years ago
- Dataset used to evaluate Skill Extraction systems based on the ESCO skills taxonomy. ☆17 · Jul 18, 2024 · Updated last year
- Project to convert PDF files to Text files using google OCR ☆13 · May 6, 2024 · Updated last year