Featurize words into orthographic and phonological vectors.
☆42 · May 20, 2023 · Updated 2 years ago
Alternatives and similar repositories for wordkit
Users interested in wordkit compare it to the libraries listed below.
- Use spaCy for NLP and output to the FoLiA XML format. ☆12 · Feb 27, 2024 · Updated 2 years ago
- Learning BPE embeddings by first learning a segmentation model and then training word2vec ☆19 · Dec 18, 2022 · Updated 3 years ago
- T-scan: an analysis tool for Dutch texts to assess the complexity of the text, based on original work by Rogier Kraf ☆19 · May 28, 2025 · Updated 9 months ago
- speakr: A Wrapper for the Phonetic Software Praat ☆27 · Feb 28, 2026 · Updated 3 weeks ago
- Parser for KAF NAF files written in Python ☆16 · Jul 1, 2021 · Updated 4 years ago
- A lexicon compiler for non-suffixational morphologies ☆13 · Jan 29, 2026 · Updated last month
- An easy-to-use library to linguistically compare one sentence and its words to another, in the same language or a different one. For inst… ☆25 · Nov 27, 2021 · Updated 4 years ago
- spaCy + UDPipe ☆167 · Apr 19, 2022 · Updated 3 years ago
- A neural network that jointly part-of-speech tags and lemmatizes sentences, boosting accuracy for morphologically-rich languages (Czech, … ☆34 · Apr 5, 2019 · Updated 6 years ago
- tools for phoneticians and phonologists ☆32 · Dec 5, 2018 · Updated 7 years ago
- Python code for training models in the ACL paper, "Simple and Effective Paraphrastic Similarity from Parallel Translations". ☆22 · Oct 3, 2019 · Updated 6 years ago
- Repository for creating models, vocabulary and other necessities for Dutch in spaCy ☆11 · Dec 15, 2016 · Updated 9 years ago
- (NAACL 2024) Guiding Large Language Models to Post-Edit Machine Translation with Error Annotations ☆15 · Apr 14, 2025 · Updated 11 months ago
- ☆31 · Mar 14, 2017 · Updated 9 years ago
- Combining encoder-based language models ☆11 · Nov 11, 2021 · Updated 4 years ago
- Bayes Factors for brms Models ☆14 · May 26, 2022 · Updated 3 years ago
- Alpino parser and related tools for Dutch ☆27 · Feb 26, 2026 · Updated 3 weeks ago
- A Python database interface for eXist-db ☆15 · Mar 1, 2026 · Updated 2 weeks ago
- Text Interchange Formats ☆37 · Nov 26, 2023 · Updated 2 years ago
- Authorship Verification in Social Media via Attention-based Similarity Learning ☆26 · Sep 23, 2021 · Updated 4 years ago
- Use GPT-3 to process human conversations and extract context, identify information that would be useful, and suggest data sources to get … ☆29 · Dec 21, 2021 · Updated 4 years ago
- phonetic transcription for Tibetan ☆10 · Mar 13, 2019 · Updated 7 years ago
- A Nim library for phylogenetic trees ☆11 · Mar 22, 2024 · Updated last year
- A minimal, pure Python library to interface with CoNLL-U format files. ☆153 · Dec 5, 2025 · Updated 3 months ago
- benchmarks for evaluating MT models ☆11 · Jun 26, 2024 · Updated last year
- An R package for easy and flexible Bayesian Measurement Modeling ☆17 · Mar 10, 2026 · Updated last week
- LingPy: Python library for quantitative tasks in historical linguistics ☆141 · Dec 6, 2025 · Updated 3 months ago
- HuCit KB: a knowledge base of classical texts and citable text units. ☆11 · Nov 17, 2021 · Updated 4 years ago
- Common Lisp bindings for the Tesseract OCR library. ☆13 · Jan 9, 2025 · Updated last year
- Repo of the Turing's Humanities & Data Science Discussion Group ☆13 · Jul 21, 2022 · Updated 3 years ago
- Tools and scripts for working with ELAN ☆10 · Aug 4, 2022 · Updated 3 years ago
- Violets are BLUE. OLS is too. (R package) ☆15 · Aug 11, 2023 · Updated 2 years ago
- ☆11 · Nov 16, 2022 · Updated 3 years ago
- ☆13 · Mar 28, 2018 · Updated 7 years ago
- Minimal code to train ELMo models in recent versions of TensorFlow ☆14 · Apr 30, 2023 · Updated 2 years ago
- Interlinear glosses for pandoc ☆10 · Feb 12, 2018 · Updated 8 years ago
- Twitter Sentiment Analysis - BITS Pilani ☆12 · Mar 27, 2014 · Updated 11 years ago
- Generic Environment for Context-Aware Correction of Orthography ☆22 · Sep 7, 2022 · Updated 3 years ago
- Open-source Human Feedback Library ☆11 · Oct 25, 2023 · Updated 2 years ago