AzBuki-ML / public-data
Custom-built Bulgarian language data sets, used by АзБуки.ML for sentiment analysis, text classification, summarisation and generation. Open-source & free to use in any ML project.
☆19 Updated 2 years ago
Alternatives and similar repositories for public-data
Users who are interested in public-data compare it to the repositories listed below.
- ☆33 Updated last year
- LoanPy is a linguistic toolkit for rule-based prediction and evaluation of loanword adaptation and historical reconstructions and can be … ☆16 Updated last year
- Greek open source Morphological dictionary and application of it to Greek spelling tools ☆37 Updated 2 years ago
- Pre-trained BERT Models for Ancient and Medieval Greek, and associated code for LaTeCH 2021 paper titled - "A Pilot Study for BERT Langua… ☆39 Updated 3 years ago
- Latin BERT ☆70 Updated last year
- Ancient Greek language models for spaCy ☆35 Updated 10 months ago
- LingPy: Python library for quantitative tasks in historical linguistics ☆139 Updated 2 months ago
- Ancient Greek lemmatisation tool ☆22 Updated 6 months ago
- Extension for pie to include taggers with their models and pre/postprocessors ☆11 Updated last year
- A character-wise tokenizer for morphologically rich languages ☆30 Updated 4 months ago
- Metrical position in Greek hexameter. ☆13 Updated 2 weeks ago
- XML files for the works in the First Thousand Years of Greek Project. Please see our Wiki on how to contribute. ☆107 Updated last week
- Curated list of open-access/open-source/off-the-shelf resources and tools developed with a particular focus on German ☆515 Updated last year
- All languages stopwords collection ☆476 Updated 2 years ago
- A general-purpose NLP pipeline for Ancient Greek ☆25 Updated last year
- Machine-readable lists of lemma-token pairs in 23 languages. ☆358 Updated 4 years ago
- Modules used for separating articles in (historical) newspapers and similar documents. This repository is part of the European Union's Ho… ☆22 Updated 3 years ago
- Linguistic Reconstruction with LingPy ☆15 Updated last year
- Programming for Historians ☆17 Updated 3 years ago
- Official releases of the PROIEL treebank of ancient Indo-European languages ☆39 Updated 2 years ago
- Wiktra - Python tool of Wiktionary Transliteration modules for 514 languages and its 102 different scripts (orthographies) ☆34 Updated 7 months ago
- R package which provides access to the DraCor API. ☆34 Updated 4 months ago
- Ancient Greek and Latin stopwords for textual analysis ☆18 Updated 2 years ago
- Public repository for Coptic SCRIPTORIUM Corpora Releases ☆40 Updated last month
- ☆33 Updated last week
- GLEM is a lemmatizer for Ancient Greek. ☆25 Updated 2 years ago
- Python Finite-State Toolkit ☆60 Updated last month
- Python library for automatic analysis of Ancient Greek hexameter. The algorithm uses linguistic rules and finite-state technology. ☆22 Updated last year
- ☆16 Updated 3 months ago
- Pipeline to generate the Standardized Project Gutenberg Corpus ☆208 Updated 2 years ago