droher / etymology-db
An open etymology dataset created using Wiktionary data. Contains 3.8M entries, 1.8M terms, 2900 languages, and 31 unique relationship types.
☆70 Updated 4 months ago
Related projects:
- [LREC 2020] EtymDB, an Etymological DataBase (v2.1) ☆21 Updated 2 years ago
- Interactive visualization of Wiktionary words and etymologies. ☆91 Updated 3 weeks ago
- A cloud-based, open-source system for writing and publishing dictionaries. ☆85 Updated 8 months ago
- This repository contains code behind the visualization of the Wikimedia tool etytree at http://tools.wmflabs.org/etytree/ ☆50 Updated 4 years ago
- A Python package for learning, evaluating, annotating, and extracting vector representations of construction grammars ☆32 Updated 6 months ago
- eXtensible Interlinear Glossed Text ☆31 Updated 2 years ago
- A language evolution simulator, using realistic phonetic changes. ☆37 Updated last year
- Grammatical Framework's Resource Grammar Library (RGL) ☆52 Updated 2 weeks ago
- A Python module to discover the etymology of words ☆144 Updated 4 months ago
- Latin BERT ☆56 Updated 2 months ago
- Helsinki Finite-State Technology (library and application suite) ☆119 Updated 3 weeks ago
- CogNet: a large-scale, high-quality cognate database for 338 languages, 1.07M words, and 8.1 million cognates ☆42 Updated last year
- The World Atlas of Language Structures ☆51 Updated 2 months ago
- A simple tool for splitting up an ebook into its chapters. Works well with Project Gutenberg texts. May also be used to clean up books fo… ☆89 Updated 6 years ago
- Official repository for Semlink resources ☆32 Updated 2 years ago
- A modern, interlingual wordnet interface for Python ☆207 Updated 9 months ago
- Etymological graphs based on Wiktionary dumps ☆18 Updated last year
- Creates interlinearized versions of books (EPUB, MOBI, etc), adding "subtitles" with translations under each word in the text. ☆22 Updated 3 years ago
- University of Colorado VerbNet ☆99 Updated 4 months ago
- CLDF: Cross-Linguistic Data Formats - the specification ☆53 Updated 5 months ago
- Grammatical Framework core: compiler, shell & runtimes ☆129 Updated 2 weeks ago
- The curation repository for the data behind Concepticon. ☆32 Updated this week
- Automatically exported from code.google.com/p/foma ☆115 Updated 2 months ago
- The Open English WordNet ☆459 Updated last week
- A list of resources for conservation, development, and documentation of endangered, minority, and low or under-resourced human languages. ☆34 Updated last year
- linguistics tree drawing to SVG in python, aimed at Jupyter ☆59 Updated last month
- The official repository for the Project Dialogism Novel Corpus, a dataset of annotated quotations in full-length English novels. ☆38 Updated 11 months ago
- a python package for cleaning Gutenberg books and dataset ☆30 Updated last year
- Collaborative data curation for Glottolog ☆149 Updated last month
- The Global WordNet Association Collaborative Inter-Lingual Index ☆40 Updated 3 months ago