droher / etymology-db
An open etymology dataset created using Wiktionary data. Contains 3.8M entries, 1.8M terms, 2900 languages, and 31 unique relationship types.
☆92 Updated 10 months ago
Alternatives and similar repositories for etymology-db:
Users who are interested in etymology-db are comparing it to the repositories listed below.
- The curation repository for the data behind Concepticon. ☆38 Updated last month
- eXtensible Interlinear Glossed Text ☆32 Updated 2 years ago
- [LREC 2020] EtymDB, an Etymological DataBase (v2.1) ☆24 Updated 3 years ago
- A cloud-based, open-source system for writing and publishing dictionaries. ☆89 Updated last year
- This repository contains code behind the visualization of the Wikimedia tool etytree at http://tools.wmflabs.org/etytree/ ☆51 Updated 5 years ago
- A Python package for learning, evaluating, annotating, and extracting vector representations of construction grammars ☆37 Updated 6 months ago
- Interactive visualization of Wiktionary words and etymologies. ☆92 Updated 2 months ago
- Making the public domain Loebs more easily downloadable. Data at https://github.com/ryanfb/loebolus-data ☆95 Updated last week
- Python API to access glottolog/glottolog ☆29 Updated 5 months ago
- Collaborative data curation for Glottolog ☆160 Updated last week
- This is a collection of sentence-level aligned Sanskrit-Tibetan Etexts. ☆15 Updated 2 years ago
- Analyse rhyme scheme, metre and form of poems ☆130 Updated 3 years ago
- CogNet: a large-scale, high-quality cognate database for 338 languages, 1.07M words, and 8.1 million cognates ☆48 Updated last year
- CLDF: Cross-Linguistic Data Formats - the specification ☆57 Updated last year
- AUTOTYP data export ☆41 Updated last year
- Creates interlinearized versions of books (EPUB, MOBI, etc), adding "subtitles" with translations under each word in the text. ☆24 Updated 4 years ago
- SegBo: A database of borrowed sounds in the world’s languages ☆16 Updated last year
- The Global WordNet Association Collaborative Inter-Lingual Index ☆42 Updated 5 months ago
- Etymological graphs based on Wiktionary dumps ☆21 Updated last month
- A simple tool for splitting up an ebook into its chapters. Works well with Project Gutenberg texts. May also be used to clean up books fo… ☆106 Updated 6 years ago
- The World Atlas of Language Structures ☆60 Updated 6 months ago
- The official repository for the Project Dialogism Novel Corpus, a dataset of annotated quotations in full-length English novels. ☆39 Updated last year
- Wiktra - Python tool of Wiktionary Transliteration modules for 514 languages and their 102 different scripts (orthographies) ☆30 Updated 3 years ago
- A language evolution simulator, using realistic phonetic changes. ☆38 Updated 2 years ago
- Helsinki Finite-State Technology (library and application suite) ☆129 Updated last week
- Yet another search platform for linguistic corpora. ☆22 Updated last week
- A modern, interlingual wordnet interface for Python ☆241 Updated this week
- A textual corpus database for the digital humanities. ☆62 Updated 4 years ago
- The Open English WordNet ☆533 Updated 2 months ago
- Resources for conservation, development, and documentation of low resource (human) languages. ☆413 Updated 2 weeks ago